Good TREs work

University College London (ucl) projects

4852 data files in total were disseminated unsafely (information about files used safely is missing for TRE/"system access" projects).


🚩 University College London (ucl) was sent multiple files from the same dataset, in the same month, both with optouts respected and with optouts ignored. University College London (ucl) may not have compared the two files, but the identifiers are consistent between datasets, and outside of a good TRE NHS Digital cannot know what recipients actually do.

Evaluation of aid to diagnosis for congenital dysplasia of the hip in general practice: controlled randomised trial — DARS-NIC-309509-L2G1J

Type of data: information not disclosed for TRE projects

Opt outs honoured: Identifiable (Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 - s261(5)(d); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2023-12-18 — 2026-12-17 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: GREAT ORMOND STREET HOSPITAL FOR CHILDREN NHS FOUNDATION TRUST

Sublicensing allowed: No

Datasets:

  1. Hospital Episode Statistics Admitted Patient Care (HES APC)

Objectives:

Great Ormond Street Hospital for Children NHS Foundation Trust (GOSH) requires access to NHS England data for the purpose of the following research project: Hip Dysplasia Screening Programme (HipDyS study)

The following is a summary of the aims of the research project provided by GOSH:
“Developmental dysplasia of the hip (DDH) is one of the most common congenital abnormalities and early diagnosis is key for successful treatment. Infant hips are initially examined at birth; however, this cannot always detect cases of DDH, therefore all infants undergo a second examination with their General Practitioner (GP) at 6-8 weeks of age. Despite both of these hip checks, 1-2 in 1000 children are still diagnosed late with DDH. As a result, more than 2000 hip replacements are performed every year in the UK because of DDH. It can be harmful both to miss DDH and to incorrectly diagnose infants as having DDH.

“It is not well understood why children that do not have DDH are incorrectly diagnosed as having DDH and why some who actually have DDH are not detected early enough. Studies suggest that this may be related to the examiner’s knowledge, skills and/or the way the hip check consultation is conducted. The Hip Dysplasia Screening Programme (HipDys study) seeks to address these disparities to improve the ability of GPs in evaluating infants’ hips using a diagnostic aid, which in turn could yield a financial benefit to the NHS by way of eliminating unnecessary referrals to secondary care, appointments and subsequent treatments for DDH. More specifically, the main aim of this randomised control trial is to determine whether the diagnostic aid for DDH reduces the number of clinically insignificant referrals from primary to secondary care, as well as reduce the number of DDH cases diagnosed late.

“GPs from 172 GP practices across England, who carry out the 6-week hip check on infants between 42 and 70 days old, will be divided into two groups. Eligible participants will be identified by general practice patient registers and infants will be invited to attend a 6-week check at the practice. One group of GP practices will be given the diagnostic aid, comprising of a video tool and a checklist (HipDyS checklist) to use in all hip checks they carry out. The other group will screen for DDH as normal, without the use of the HipDyS checklist. The two groups will then be compared to see if the first group better identified infants with DDH than the second group. Researchers from the study will also evaluate whether using the checklist reduces costs for families around trips to doctors or hospitals, and costs to the NHS.”

The following NHS England data will be accessed:
• Hospital Episode Statistics
o Admitted Patient Care – necessary because inpatient data will provide details of any inpatient procedures conducted by orthopaedics (e.g. surgery) following the 6-week hip check for infants who have undergone their 6-week check.

The level of the data will be:
• Identifiable – necessary because although the applicant will pseudonymise the data before analysis, Date of Birth is required to ensure that the data received is matched to the correct infant.

The data will be minimised as follows:
• Limited to a study cohort of approximately 16,720 infants between 42 and 70 days old, invited for the 6-week hip check at participating GP surgeries in England.
• Limited to data between 1st December 2020 and June 2026. For each individual patient, data will only be provided from the provided hip check date and will be restricted up to the 2nd anniversary of said hip check date.
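
The per-patient time limit above can be sketched as a simple date filter. This is only an illustration, not the method NHS England uses: the function names are hypothetical, the boundaries are assumed to be inclusive, and "June 2026" is assumed to mean 30 June 2026.

```python
from datetime import date


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February in a target year that is not a leap year
        return d.replace(year=d.year + years, day=28)


def in_follow_up_window(
    episode_date: date,
    hip_check_date: date,
    study_start: date = date(2020, 12, 1),
    study_end: date = date(2026, 6, 30),  # assumption: end of June 2026
) -> bool:
    """True if an episode falls inside both the study period and the
    patient's own window (hip check date to its 2nd anniversary)."""
    window_end = add_years(hip_check_date, 2)
    in_study_period = study_start <= episode_date <= study_end
    in_patient_window = hip_check_date <= episode_date <= window_end
    return in_study_period and in_patient_window
```

An episode dated before the infant's hip check, or after its 2nd anniversary, would be excluded even if it falls within the overall study period.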

The study has received Section 251 support from the Confidentiality Advisory Group (CAG) (19/CAG/0198) to access information on the 6-week hip check for infants who meet those criteria. The study team seek to get this information through the HES inpatient (admitted patient care) data set, as well as through central monitoring. The alternative way of following up these infants would be to visit all 172 practices and the hospitals that infants have been referred to, and to go through the records of at least 110 infants per GP practice at 2 years; however, the study does not have the resources (e.g. staffing or financial) to achieve this.

GOSH is the research sponsor and the controller as the organisation responsible for ensuring that the data will only be processed for the purpose described above.

The lawful basis for processing personal data under the UK GDPR is:
Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller;

The lawful basis for processing special category data under the UK GDPR is:
Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

The study is in the public interest as it aims to reduce unnecessary costs to the NHS (thus enabling the NHS to reallocate investment) and it aims to reduce unnecessary stress on affected infants and their parents/guardians.

The funding is provided by the National Institute for Health and Care Research (NIHR). The funding is specifically for the HipDyS study as described.

University College London (UCL) is a processor acting under the instructions of GOSH. UCL’s role is limited to storing and processing the data in line with the study protocol (as determined by GOSH).

Amazon Web Services (AWS) is a processor acting under the instructions of UCL. AWS’ role is limited to secure back-up of data stored in UCL’s Data Safe Haven.


Under the HipDyS study umbrella, a qualitative study will examine the effects of implementing the trial intervention in practice. It includes interviews and non-participant observations with a sample of GPs, parents/carers of infants and hospital consultants, carried out by a researcher from King’s College London (KCL), who is leading the qualitative study with the University of Bedfordshire. The qualitative study does not include data from NHS England, and neither KCL nor the University of Bedfordshire has any influence over the NHS England data under this Data Sharing Agreement.

Data will only be accessed by individuals who are substantive employees of UCL.

GOSH will take advice from patient and public involvement (PPI) groups as to the best way to disseminate results to relevant service users. The PPI group comprises professionals and parents from GOSH and the STEPs charity (https://www.stepsworldwide.org/). The PPI group has played a vital role in designing the trial. Its members advised and collaborated with the study to put together the ethics application and developed study materials including the on-line training video for GPs (part of the diagnostic aid), the study questionnaires and the interview schedules for parents/carers and GPs. The PPI group was also involved in the review of the findings from prior research linked to this randomised controlled trial. The PPI group meets with the study team at various points throughout the trial, with the outcome of these meetings feeding into the trial’s steering group meetings.

Expected Benefits:

At present, the number of referrals that are deemed clinically insignificant is higher than the number deemed clinically significant. In addition, 1-2 in every 1000 children are diagnosed late with DDH. The primary aim of this trial is to reduce the number of clinically insignificant referrals to secondary care (referrals that result in immediate discharge and no diagnosis). Additionally, the trial aims to reduce the number of late diagnoses of DDH and increase the accuracy of clinically significant referrals (referrals that result in treatment/monitoring), to improve health outcomes for infants found to have DDH.

Having access to HES data will allow the study team to track children that were assessed using the HipDyS trial checklist and compare their data to those who were not assessed using the checklist. This comparison will allow the trial team to assess whether the diagnostic aid reduces the number of insignificant and/or late referrals to secondary care. As such, the diagnostic aid will provide a structured approach for GPs to the examination of infants within primary care, thus utilising the referral pathways for those truly in need, with the potential of allocating NHS resources more efficiently in the long term and relieving the impact on the already strained NHS service.

Similarly, the intended benefits to infants and their parents/carers are:
• Alleviate unnecessary worry for infants that would have otherwise been incorrectly diagnosed as having DDH
• Improve health outcomes of infants who do truly have DDH, as they’re able to be seen and treated by a specialist quicker
• Reduction or elimination of travel costs to unnecessary hospital appointments for parents/carers of infants that would have otherwise been incorrectly diagnosed.

The use of routine electronic health records within multicentre trials has the potential both to reduce the impact of additional trial-specific GP practice visits on clinical research teams and to significantly reduce the costs of central trial coordination and data collection.

If the results of the study are favourable, the team hope to make the diagnostic aid available to GPs by mid 2027 for use when conducting 6-week checks, alongside other current government guidance they may use on conducting these checks.

Outputs:

Trial findings will be disseminated in the following ways:

1. Presentations at national and international academic conferences to ensure members of the academic community within paediatrics, orthopaedics and behavioural psychology (regarding the qualitative and Health Economics studies) are informed. These conferences include: the European Paediatric Orthopaedic Society conference, British Society for Children’s Orthopaedic Surgery, Paediatric Orthopaedic Society of North America and the International Society of Behavioural Medicine conference.

2. Publication in biomedical journals: the trial will be reported in accordance with the CONSORT (Consolidated Standards of Reporting Trials) statement (www.consort-statement.org). GOSH aims to publish its results in a high-impact general medical journal. GOSH will also contact the free publications received by most UK GPs to ask them to publicise the results. These journals include: The Journal of Bone & Joint Surgery, Journal of Children’s Orthopaedics, Lancet, British Medical Journal, The Journal of Pediatrics, Translational Behavioural Medicine, Implementation Science Journal and British Journal of Health Psychology.

3. Royal Colleges: GOSH will ensure the primary care community, physicians and nurses are informed of the results through links with the Royal Colleges of General Practitioners; Surgeons; Paediatrics & Child Health.

4. NHS: All results will be communicated to NHS England and to local Integrated Care Boards, especially the Clinical Reference Groups for Orthopaedics and Paediatrics, of which the chief investigator is a member.

5. National Institute for Health and Care Excellence (NICE), Clinical Reference Group Orthopaedics NHS England, British Society of Children’s Orthopaedics: All will be made aware of trial results.

6. Service Users: GOSH will actively communicate with patient and public involvement (PPI) groups

7. Press: UCL and Great Ormond Street have press offices, which connect with medical journalists throughout the global media.

8. Websites: GOSH will publicise results on the UCL website. GOSH will also approach Great Ormond Street Hospital Children’s Charity to link to other web sites.

9. Support groups such as STEPs or MumsNet: to make members of the public who have an interest in, or are affected by, hip problems aware of the trial results and how their care could be impacted.

10. Several co-applicants have international reputations and may be asked to lecture on various aspects of primary care, paediatric orthopaedics, and research methodology, thus introducing new practices into the relevant medical curriculum.

The specific journals and conferences mentioned will allow for coverage of key stakeholders in the setting of primary care, clinical trial design and clinicians involved in orthopaedics research. The aim is to begin presenting results and submitting papers to journals within months of trial results being established, approx. December 2026.

All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.
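
Small-number suppression of aggregated outputs can be sketched as follows. This is a minimal illustration only: the function name, the `*` marker and the choice to leave zero counts unchanged are assumptions, and the threshold is deliberately left as a required parameter because the actual rule must be taken from the HES Analysis Guide itself.

```python
def suppress_small_numbers(counts, threshold, marker="*"):
    """Replace any non-zero count at or below `threshold` with `marker`.

    `counts` maps a category label to an aggregated count. The threshold is
    supplied by the caller; this sketch does not encode any official value.
    Zero counts are left unchanged (an assumption of this sketch).
    """
    return {
        label: (marker if 0 < value <= threshold else value)
        for label, value in counts.items()
    }


# Hypothetical aggregated output, with an illustrative threshold of 7
referrals_by_arm = {"checklist": 3, "usual care": 12, "withdrawn": 0}
safe_table = suppress_small_numbers(referrals_by_arm, threshold=7)
```

Suppression alone may not be sufficient where a suppressed cell can be recovered from row or column totals, so a real output check would also consider secondary disclosure.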

Processing:

UCL will transfer data to NHS England. The data will consist of identifying details (specifically NHS Number, Date of Birth) and a unique person ID for the cohort to be linked with NHS England data.

NHS England will provide the relevant records from the HES Admitted Patient Care dataset to UCL. The data will contain directly identifying data items (Date of Birth) which are required to confirm the correct link at record level with data already held by the recipient. Additionally, the data flow from NHS England will contain a unique person ID which can be used to link the data with other record level data already held by the recipient.

The data will not be transferred to any other location.

The data will be stored on servers at UCL.


Amazon Web Services provides cloud hosting services to UCL and will store the data as contracted by UCL.

UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

The Data will be accessed by authorised personnel via remote access.
The Controller(s) must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.
For remote access:
- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation’s DSPT (or other security arrangements as per this DSA) and complies with the organisation’s remote access policy.
The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

The data will not leave England at any time.

Access is restricted to individuals within the department of UCL who have authorisation from the study team. All such individuals are substantive employees of UCL.

GOSH is not permitted to access the data.

All personnel accessing the data have been appropriately trained in data protection and confidentiality.

The data will be linked at person record level with the reported study data obtained from GPs.

The identifying details will be stored in a separate database to the linked dataset used for analysis. All analyses will use the pseudonymised dataset. There will be no requirement and no attempt to reidentify individuals when using the pseudonymised dataset.
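
The separation described above, with identifying details held apart from the dataset used for analysis, can be sketched as below. This is an illustration under stated assumptions rather than the study's actual process: the field names (`person_id`, `nhs_number`, `date_of_birth`) and the function name are hypothetical.

```python
def split_identifiers(records, id_fields=("nhs_number", "date_of_birth"),
                      key_field="person_id"):
    """Split linked records into an identifier table and an analysis table.

    Both tables keep the unique person ID (`key_field`) so that authorised
    staff could check a link if required; only the identifier table keeps
    the directly identifying fields.
    """
    identifiers, analysis = [], []
    for record in records:
        # Identifier table: study ID plus the direct identifiers only
        identifiers.append(
            {key_field: record[key_field], **{f: record[f] for f in id_fields}}
        )
        # Analysis table: everything except the direct identifiers
        analysis.append({k: v for k, v in record.items() if k not in id_fields})
    return identifiers, analysis


# Hypothetical linked record (made-up values)
linked = [{"person_id": "P001", "nhs_number": "0000000000",
           "date_of_birth": "2021-01-15", "procedure_code": "X"}]
id_table, analysis_table = split_identifiers(linked)
```

In this sketch the two tables would then be stored in separate databases, with all analyses run against `analysis_table` only.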

Researchers from the PRIMENT Clinical Trials Unit (https://www.ucl.ac.uk/priment/home/priment-clinical-trials-unit) within UCL will process the data for the purposes described above.


A phase III, double blind, placebo controlled, randomised trial assessing the effects of aspirin on disease recurrence and survival after primary therapy in common non metastatic solid tumours ( ODR1718_261 ) — DARS-NIC-656806-N9V7N

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant (Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 – s261(2)(a)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2024-01-15 — 2025-01-14 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. NDRS Cancer Registrations
  2. NDRS Linked Cancer Waiting Times (Treatments only)
  3. NDRS Linked DIDs
  4. NDRS Linked HES AE
  5. NDRS Linked HES APC
  6. NDRS Linked HES Outpatient
  7. NDRS National Radiotherapy Dataset (RTDS)
  8. NDRS Systemic Anti-Cancer Therapy Dataset (SACT)

Objectives:

University College London (UCL) requires access to NHS England data for the purpose of the following research project:
A phase III, double blind, placebo controlled, randomised trial assessing the effects of aspirin on disease recurrence and survival after primary therapy in common non metastatic solid tumours.

The following is a summary of the aims of the research project provided by UCL:

A clinical trial to find out whether taking aspirin daily for 5 years after treatment for an early-stage cancer stops or delays the cancer coming back. The work covered here tests whether routine data sources can replace traditional approaches to patient follow-up.

The primary outcomes of this study are:
- Overall survival for all participants
- Invasive disease-free survival for breast cancer
- Disease-free survival for colorectal cancer
- Disease-free survival for gastro-oesophageal cancer
- Biochemical recurrence-free survival for prostate cancer
Secondary outcomes are: In all participants these will include adherence, toxicity including serious haemorrhage, and cardiovascular events, as well as some tumour site-specific secondary outcome measures.


The following NHS England Data will be accessed:
• NDRS Linked Hospital Episode Statistics
o Admitted Patient Care
o Accident & Emergency
o Outpatients
• NDRS Linked Cancer Registration
• NDRS Linked Diagnostic Imaging Dataset (DID)
• NDRS Systemic Anti-Cancer Therapy Dataset (SACT)
• NDRS Radiotherapy Dataset (RTDS)
• NDRS Linked Cancer Waiting Times (Treatment Data) (CWT)

The level of the data will be pseudonymised.

The Data will be minimised as follows:
• Limited to a consented study cohort identified by UCL – UCL will provide the NHS Numbers and cancer site/morphology codes for the data required.
• Limited to data starting from trial initiation in 2015.
• For each individual patient, data will only be provided from date of enrolment into the trial and until 10 years after their completion in the trial.
• Limited to the following geographic areas: England.


UCL is the research sponsor and the controller as the organisation responsible for ensuring that the Data will only be processed for the purpose described above.

Although Tata Memorial Centre (TMC) is the study sponsor, Tata Memorial Centre (TMC) will not carry out any controllership activities nor have the ability to limit or suppress outputs.

The lawful basis for processing personal data under the UK GDPR is:

Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller;

The lawful basis for processing special category data under the UK GDPR is:
Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.


The funding is provided by Cancer Research UK and NIHR HTA (Health Technology Assessment Programme). The funding is specifically for the study described. Funding is in place until 31 August 2028.

The funder(s) will have no ability to suppress or otherwise limit the publication of findings.


Amazon Web Services provides backup services to UCL and will store the Data as contracted by UCL.
VIRTUS provides IT support to UCL.


No one other than UCL, as listed above, will be accessing the data.


Data will be accessed by a PhD student affiliated with UCL. The individual has completed mandatory data protection and confidentiality training and is subject to UCL’s policies on data protection and confidentiality. The individual accessing the data will do so under the supervision of a substantive employee of UCL. UCL would be responsible and liable for any work carried out by the individual. The PhD student would only work on the data for the purposes described in this Data Sharing Agreement (DSA).


There are 8 Public and Patient Involvement representatives with the Add-Aspirin trial who all support this research.

The national data opt-out does not apply where explicit consent has been obtained from the patient for the specific purpose.

Where individuals have opted out of disease registration by the National Disease Registration Service (NDRS), their data has been permanently removed from the registry and therefore will not be disseminated under this Data Sharing Agreement (DSA). https://digital.nhs.uk/ndrs/patients/opting-out

Yielded Benefits:

A previous publication has detailed the process and practical considerations for accessing routinely-collected healthcare data for use in clinical trials to support other research groups in undertaking such work (Macnair et al. Trials (2021) 22:340). Update as per latest confirmation report submitted 09 Jan 2024: Publication in a peer-reviewed cancer research journal reporting on the comparison between trial-specific follow-up and routinely-collected health data for ascertaining key outcomes within the Add-Aspirin trial (an ongoing large phase III RCT in multiple cancer types). Submission planned for November 2023 with expected publication date Q1 - 2 2024. The above publication, along with the previous publication on this work (see below), have the potential for wide-reaching implications for future conduct of cancer clinical trials. There is currently much interest in use of routinely collected health data for improving the efficiency of cancer trials and reducing the burden on NHS teams and resources. These publications report on both the practicalities of obtaining and using such data within trials, as well as the extent to which it is fit-for-purpose for the assessment of key (cancer and non-cancer) outcomes.

Expected Benefits:

The above publication, along with the previous publication on this work (see below), have the potential for wide-reaching implications for future conduct of cancer clinical trials. There is currently much interest in use of routinely collected health data for improving the efficiency of cancer trials and reducing the burden on NHS teams and resources. These publications report on both the practicalities of obtaining and using such data within trials, as well as the extent to which it is fit-for-purpose for the assessment of key (cancer and non-cancer) outcomes.


The following specific benefits to patients are expected as an outcome, subject to the findings:
- Streamlining the data collection process through routine health data could reduce the overall cost of conducting trials. This may contribute to making innovative cancer treatments more accessible for patients.
- A comprehensive assessment and comparison of trial data and HSD may provide a more holistic understanding of the impact of treatments on patients' overall health and quality of life by tracking mortality, recurrence, new cancer diagnoses and major non cancer events.
It is also envisaged that the benefits listed above will lead to public benefits.

Outputs:

The expected outputs of the processing will be a publication in a peer-reviewed cancer research journal reporting on the comparison between trial-specific follow-up and routinely-collected health data for ascertaining key outcomes within the Add-Aspirin trial (an ongoing large phase III RCT in multiple cancer types). Submission planned for November 2023 with expected publication date Q1 - 2 2024.


The outputs will not contain NHS England Data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived.


The outputs will be communicated to relevant recipients through the following dissemination channels:
• Journals
• Public reports
• MRC CTU website


The target date for production and dissemination of the outputs is Q1-Q2 of 2024.

Processing:

No more data will flow under this agreement. Historically, UCL transferred data to NDRS (at Public Health England, now NHS England). The data consisted of identifying details (specifically NHS Number and Date of Birth) for the consented cohort to be linked with NDRS data.

NDRS provided the relevant records from the NDRS (HESAPC), NDRS (HESAE), NDRS (HESOP), NDRS Cancer Registration, NDRS (DID), NDRS (SACT), NDRS (RTDS), NDRS (CWT) datasets to UCL.

The Data contained no direct identifying data items but contained a unique person ID which can be used to link the Data with other record level data already held by the recipient.

The Data will not be transferred to any other location. The Data will not leave England.

The Data will be stored on servers at Amazon Web Services (AWS).
AWS’s role is limited to secure back-up of data stored in UCL’s Data Safe Haven. UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

UCL uses offsite back-up services provided by AWS.

The Data will be accessed by authorised personnel on site and via remote access.

The Controller(s) must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:
- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation’s DSPT (or other security arrangements as per this DSA) and complies with the organisation’s remote access policy.

The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

Remote processing will be from secure locations within England/Wales. The data will not leave England/Wales at any time.

Data will be accessed by individuals with an honorary contract with UCL. The individuals will act as agents of UCL at all times under supervision from employees of UCL. Aside from these individuals, access is restricted to employees or agents of UCL who have authorisation (e.g. from the Principal Investigator).

AWS and VIRTUS are not permitted to access the Data.

All personnel accessing the Data have been appropriately trained in GDPR/data protection and confidentiality.

The Data will not be linked with any other data outside of this agreement.

There will be no requirement and no attempt to reidentify individuals when using the Data.

Researchers from UCL will use the relevant subset of data to undertake the socio-economic analysis described above.


Understanding and improving the use of investigations in primary care in patients subsequently diagnosed with cancer (ODR1920_196) — DARS-NIC-656863-R0V6Q

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant (Does not include the flow of confidential data)

Legal basis: Other-The Health Service (Control of Patient Information) Regulations- Regulation 2

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2023-12-20 — 2026-12-19 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. NDRS Cancer Registrations
  2. NDRS National Cancer Diagnosis Audit (NCDA)

Objectives:

University College London (UCL) requires continued access to NHS England National Disease Registration Service (NDRS) National Cancer Registration and Analysis Service (NCRAS) data for the following research project:
Understanding and improving the use of investigations in primary care in patients subsequently diagnosed with cancer.

Before the dissolution of Public Health England (PHE), this request was managed by the PHE Office for Data Release (ODR) under the reference ODR1920_196. All data being retained under this Agreement was previously disseminated by the ODR.

The following is a summary of the aims of the research project:
1. To understand how often primary care investigations are used in patients before a cancer diagnosis, and which patient and disease factors are associated with their use.
2. To understand how the use of these investigations is associated with the length of diagnostic intervals and process and outcome measures of the diagnostic process.

The following NHS England NDRS NCRAS data will be accessed:
• NDRS National Cancer Diagnosis Audit (NCDA)
• NDRS Cancer Registrations
Access to the above datasets is necessary because they provide information that is integral to achieving the aim of the study.

The level of the data is pseudonymised.

The data has been minimised as follows:
• Limited to individuals the NDRS identify as being included in the NCDA 2014 cohort, as defined by the NCDA_Diagnosisdatebest field as patients diagnosed with any cancer type between January 1st 2014 and December 31st 2014 and who were registered at one of the participating general practices in England
AND;
• Limited to individuals the NDRS identify as being included in the NCDA 2018 cohort- as defined by the NCDA_diagnosisdatebest field as patients diagnosed with any cancer type between January 1st 2018 and December 31st 2018

UCL is the research sponsor and the controller is the organisation responsible for ensuring that the data will only be processed for the purpose described above.

The lawful basis for processing personal data under the UK GDPR is Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller;

The lawful basis for processing special category data under UK GDPR is Article 9(2)(j) processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

Amazon Web Services (AWS) is a processor acting under the instructions of UCL. AWS’ role is limited to secure backup of data stored in UCL’s Data Safe Haven.

UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

The study benefits from oversight/steering by the National Cancer Diagnosis Audit methodology/steering groups, with input from colleagues at the University of Exeter and Cancer Research UK. Neither CRUK nor the University of Exeter will have access to the data held under this Agreement.

Data will be accessed by:
• Undergraduate, Masters or PhD students affiliated with UCL. Any student working with the data held under this Agreement must have completed mandatory data protection and confidentiality training and is subject to UCL’s policies on data protection and confidentiality. Any students accessing the data will do so under the supervision of a substantive employee of UCL. UCL would be responsible and liable for any work carried out by students. These students would only work on the data for the purposes described in this Agreement.

In line with the national data opt-out policy, opt-outs are not applied because the data is not Confidential Patient Information as defined in section 251(10) and section 251(11) of the National Health Service Act 2006.

Where individuals have opted out of disease registration by the National Disease Registration Service (NDRS), their data has been permanently removed from the registry and therefore will not be disseminated under this Data Sharing Agreement (DSA). https://digital.nhs.uk/ndrs/patients/opting-out

Yielded Benefits:

The findings of the study have so far identified avenues for quality improvement activity relating to the management and referral of suspected cancer.

Expected Benefits:

The findings of this research study are expected to contribute to evidence-based decision-making for policy-makers, local decision-makers such as doctors, and patients to inform best practice to improve the care, treatment and experience of health care users relevant to the subject matter of the study.

Through achieving the study’s aims the findings may allow the identification of opportunities for earlier diagnosis that can be obtained by use of blood and other tests (endoscopy, imaging) in primary care among patients who were subsequently diagnosed with cancer.

The study may allow researchers to obtain a granular understanding of variation between different patient groups in the use of diagnostic tests ordered by GPs, therefore identifying opportunities for improvement. The researchers additionally may gain an understanding of the predictors and implications of using or not using the test for the patients, to support the prioritisation of the implementation of the findings, and their incorporation into clinical audit and improvement initiatives.

It is hoped that through the publication of findings in appropriate media, the findings of this research will add to the body of evidence that is considered by the bodies, organisations and individual care practitioners charged with making policy decisions for or within the NHS or treatment decisions in relation to specific patients.

Outputs:

The expected outputs of the processing will be:
• Submissions to peer-reviewed journals by the end of November 2024
• Presentations at appropriate conferences such as the Society for Academic Primary Care or other national and international cancer research or diagnosis research conferences.

The outputs will not contain NDRS data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the datasets from which the information was derived.
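
Small-number suppression of the kind described above can be illustrated with a short Python sketch. The threshold of 5, the `*` marker, the choice to leave zeros unsuppressed, and the example counts are all assumptions made for illustration; the disclosure rules for each dataset define the real values.

```python
def suppress_small_numbers(counts, threshold=5, marker="*"):
    """Replace counts that are non-zero but below `threshold` with `marker`.

    `threshold` and `marker` are illustrative assumptions, not values taken
    from the agreement or from any NDRS disclosure rule.
    """
    return {
        group: (marker if 0 < n < threshold else n)
        for group, n in counts.items()
    }


# Made-up aggregate counts, used only to show the behaviour.
table = {"Blood test": 412, "Imaging": 37, "Endoscopy": 3, "None": 0}
print(suppress_small_numbers(table))
```

In practice, secondary suppression may also be needed so that a hidden cell cannot be recovered from row or column totals; this sketch does not attempt that.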

The outputs will be communicated to relevant recipients through the following dissemination channels:
• Journals
• Social media
• Public reports
• Posters displayed at appropriate conferences

All outputs are due for dissemination by November 2024.

Processing:

No data will flow to NHS England for the purposes of this Agreement.

No data is being disseminated under this Agreement. UCL wish to retain NDRS NCRAS data previously disseminated by the PHE ODR. The data contains no direct identifying data items. The data is pseudonymised and individuals cannot be reidentified through linkage with other data in the possession of the recipient.

The data will be stored on servers that support the UCL Data Safe Haven.

The data will not leave or be accessed outside the UK at any time.

Access is restricted to employees or agents of UCL who have authorisation from the study lead.

All personnel accessing the data have been appropriately trained in data protection and confidentiality.

The data will not be linked with any other data.

There will be no requirement and no attempt to reidentify individuals when using the data.

Analysts from UCL will analyse the data for the purposes described above.


Childhood Outcomes after Perinatal Brain Injury (Data flowing to DfE) — DARS-NIC-475526-F3Z5H

Type of data: information not disclosed for TRE projects

Opt outs honoured: Identifiable (Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 - s261(5)(d); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2023-11-20 — 2024-10-17 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: IMPERIAL COLLEGE LONDON, UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Demographics

Yielded Benefits:

No data has been disseminated by NHS England for this research study. There are therefore no yielded benefits to date.

Expected Benefits:

This population study is hoped to provide the most complete picture of how children’s lives are affected by perinatal brain injury, providing essential information to answer parents’ questions accurately and in a meaningful family-centric manner. This information is intended to reshape clinical practice and facilitate optimum service planning within the NHS, to meet the needs of these children and their families through to adulthood, and ultimately improve their future health outcomes. An understanding of the sequelae of perinatal brain injury, specifically how and when children are affected, is expected to inform enhanced developmental surveillance across the NHS and enable the design of targeted multidisciplinary interventions to support children as needed. For example, premature infants (prone to inattention) can benefit from delayed school entry, Special Educational Needs (SEN) support, and educational packages raising awareness amongst educators of these specific challenges.

Stakeholder meeting with a view to impacting policy:
One of the project supervisors has a strong track record in bringing together key stakeholders on the issue of perinatal brain injury. Stakeholder meetings have previously been arranged with representatives from the Department of Health and Social Care, NHS Improvement, NHS England, neonatal doctors and nurses, and parent representatives to discuss how perinatal brain injury should be defined in research. This project intends to utilise these existing connections to maximise the impact of this work.

UCL is looking to hold a stakeholder meeting in the final year of the fellowship, inviting representatives with whom the research team have established relationships, from the Department of Health and Social Care, the Department for Education, NHS England, BLISS, Meningitis Research Foundation, neonatal doctors and nurses, and parent representatives. It is planned that a Professor of Human Development from Oxford University with considerable experience in shaping social and education policy globally will be invited to join the stakeholder meeting, as well as a psychologist with considerable experience in designing educational interventions for preterm infants. The study results are intended to be discussed in this meeting alongside how these findings should shape policy and practice going forward. UCL additionally anticipates creating an action plan to determine an effective strategy to disseminate this information to teachers and other professionals within the education sector, in collaboration with its stakeholders. This is hoped to ensure widespread awareness across the education sector, and that long-lasting sustainable measures are in place or planned to support current and future children with perinatal brain injury, beyond completion of this fellowship.

Anticipated impact on neonatal care, society and NHS services:

Impact on neonatal care
• Equip healthcare professionals with reliable information to counsel families (target date 2024).
• Communication aids will facilitate meaningful family-centred conversations on the neonatal unit (target date 2024)
• Help prepare families for their child’s future and understanding what additional support may be needed (long-term)
• Encourage healthcare professionals to consider the long-term impact of various neonatal care decisions (long-term)

Impact on the NHS and policymakers
• Help those involved in shaping policy, resource planning and service provision to make informed decisions about how to most effectively support these children whilst maximising the efficiency of services (target date 2024-25)
• UCL findings are intended to inform national guidelines on follow-up after brain injury (target date 2024-25)

Impact on schooling and policymakers
• Equip parents with important information about the academic impact of brain injuries to help them plan their child’s future and support them with their educational needs (long-term)
• Provide key information and education to teachers about how they can support children with perinatal brain injuries (long-term)
• Help the Department for Education in determining resource allocation and the provision of additional educational support (long-term)

Outputs:

Academic outputs are hoped to include high-impact peer reviewed publications, and international conference presentations. Findings are expected to be submitted for publication in high impact general medical journals, such as the New England Journal of Medicine, the British Medical Journal, and JAMA Pediatrics. The study results are intended to be presented at international conferences such as the Royal College of Paediatrics and Child Health annual conference, the Kings Fund annual conference, and the Paediatric Academic Societies meeting in the USA.

Publications will be Open Access as per UCL policy, and freely available both on journal websites and via the UCL webpage. Outputs will contain only aggregate level data with small numbers suppressed in line with National Neonatal Research Database (NNRD), NHS England, Office for National Statistics (ONS) and Department for Education (DfE) policy and guidance. All data will be stored within the ONS secure research service (SRS) and all outputs from this server undergo independent checks by ONS staff to ensure outputs meet regulations and could not be deemed identifiable in any way.

Dissemination of the research findings to the public (parents who have children with a perinatal brain injury) is intended to be facilitated through existing collaborations with the Neonatal Data Analysis Unit (NDAU), BLISS (the charity for babies born sick or premature) and the Meningitis Research Foundation. UCL are also looking to create an infographic/information leaflet to improve communication of prognosis after perinatal brain injury between doctors and parents. Public dissemination is intended to include production of lay research reports publicised on the NDAU, BLISS, UCL and Meningitis Research Foundation websites. Research regarding neonatal outcomes has attracted a high level of media interest, and it is anticipated that this will be the case for the proposed study. UCL are acutely aware of the potential harmful effect of inaccurate or sensational reporting of research findings in this sensitive area, and the confusion and anxiety this can cause for affected families. UCL are planning to work closely with BLISS and Imperial College London to co-ordinate press releases and ensure that information is conveyed accurately and responsibly. BLISS and the Meningitis Research Foundation are also expected to publicise findings to their followers and the general public through their social media channels.

UCL will commence analysing the data as soon as it has been made available in the ONS SRS. It is anticipated that the process of data analysis, interpretation and report writing will take approximately 36 months, with papers submitted for publication in mid to late 2024.

Processing:

The study will involve the following data processing and linkage steps:

1. Infants meeting the Department of Health definition for perinatal brain injury will be identified within the National Neonatal Research Database (NNRD) (cohort 1, n = 54,733). This database contains care data for all neonates admitted to NHS neonatal units across England, Wales and Scotland. Its population coverage is internationally unique with 100% coverage since 2012 and high representative coverage since 2008. The premature infants (< 34 weeks gestation) in cohort 1 will be matched to a comparator group of infants within the NNRD (cohort 2, n = 24,612).
2. The pseudonymised neonatal care data for cohort 1 and 2 will be transferred to the ONS Secure Research Service (SRS) by Imperial College London.
3. Under DARS-NIC-342322-Q1N7M, the NNRD will transfer the minimum identifiers for the NNRD cohorts (1 and 2) to NHS England (NHS number, date of birth, sex and postcode at birth). The NNRD will also provide the birth weight, gestation (from 2015), and multiplicity status (i.e. twins, triplets etc) for the remaining children with gestation time > 34 weeks in cohort 1 to NHS England.
4. The un-matched infants in cohort 1 with perinatal brain injury will be matched in a 1:3 ratio, by NHS England, to a comparator group of infants, identified from Birth Notifications and Civil Registrations (Births) data to create a ‘term’ control cohort (cohort 3, n = 90,363) (DARS-NIC-342322-Q1N7M).
5. All 3 cohorts will be linked to Civil Registrations (Deaths), Hospital Episode Statistics (HES) Admitted Patient Care (APC), HES Accident and Emergency (A&E), HES Outpatients and the Mental Health Services Data Set (MHSDS) up to December 31st 2020, by NHS England. The pseudonymised health outcomes and analysis covariates from the Births products for the three cohorts will be transferred from NHS England to the ONS SRS (DARS-NIC-342322-Q1N7M).
6. Under this Data Sharing Agreement (DARS-NIC-475526-F3Z5H), a file containing a list of personal identifiers (forename, surname, date of birth, sex, and postcodes) for linkage to the National Pupil Database (NPD) will be transferred from NHS England to the Department for Education (DfE). The NPD contains detailed information on the educational attainment, special educational needs and attendance of children at state schools across England between the ages of 5-18 years. A logic model, designed to maximise the chance of a reliable postcode match (given the variation over time), will be used. After linkage, all identifiers will be removed (only the unique study ID number will be retained) and these pseudonymised educational data will also be securely transferred for storage within the ONS SRS.
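
Step 6 above (link on personal identifiers, then strip them and keep only the study ID) can be sketched as follows. This is a simplified illustration in Python, not the DfE's procedure: all field names are hypothetical, and exact matching on every key field stands in for the postcode logic model, which the agreement does not describe in detail.

```python
def link_and_pseudonymise(cohort, pupils, key_fields, study_id_field="study_id"):
    """Link cohort rows to pupil rows on `key_fields`, then drop identifiers.

    All field names are hypothetical. Exact matching on every key field is a
    simplification of the matching described in the agreement.
    """
    # Index pupil records by their identifier tuple for exact lookup.
    index = {tuple(p[k] for k in key_fields): p for p in pupils}
    linked = []
    for person in cohort:
        match = index.get(tuple(person[k] for k in key_fields))
        if match is None:
            continue  # unmatched cohort members are left out of this sketch
        # Keep the study ID plus the non-identifying educational fields only.
        row = {study_id_field: person[study_id_field]}
        row.update({k: v for k, v in match.items() if k not in key_fields})
        linked.append(row)
    return linked
```

The returned rows carry no forename, surname, date of birth, sex or postcode, so only the study ID connects them to the health data held elsewhere.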

Identifiable information transferred to the DfE for matching will be controlled through secure access arrangements in line with DfE policy. Access is limited to a team of qualified (permanent DfE staff) data engineers employed on the maintenance and production of the National Pupil Database (NPD). All DfE staff accessing this data are cleared to levels in line with the Department’s vetting protocols and have Baseline Personnel Security Standard (BPSS), Disclosure and Barring Service (DBS) and Level 2 Non-Police Personnel Vetting (NPPV) check clearance. The DfE uses Microsoft Azure cloud hosting for the storage and processing of data, applying a combination of software and hardware controls which meet the ISO27001 standards and the Government Security Policy Framework. The Department’s use of Microsoft Azure hosting has approval from the Cabinet Office and meets all the relevant guidelines for holding and processing personal and restricted data. This includes ensuring the systems comply with Data Protection Legislation and other relevant legislative obligations that apply to data rated at OFFICIAL-SENSITIVE.

Microsoft Limited Azure supply Cloud Services for the DfE and are therefore listed as a processor. They supply support to the system, but do not access data. Therefore, any access to the data held under this agreement would be considered a breach of the agreement. This includes granting of access to the database(s) containing the data. Microsoft Limited Azure servers are located within the EEA.

UCL researchers will only have access to pseudonymised data held within the ONS SRS under DARS-NIC-342322-Q1N7M. In order to access any data in the ONS SRS, all researchers will need to be ONS accredited and undergo data protection and confidentiality training. No data will be held by or at UCL. There will be no requirement or attempt to re-identify participants. Indeed, this would not be possible for UCL.

UCL research staff responsible for conducting the analysis for the project will complete the ONS Researcher Accreditation process, which involves specific training in the safe use of research data environments. They will sign and adhere to the ONS Accredited Researcher Declaration, and will be required to adhere to ONS data protection policies and procedures. All data to be transferred out of the SRS (the results of the analyses) will be checked by ONS staff to ensure that no individual level data, or potentially identifiable data, is transferred. Only aggregate level data with small number suppression will be transferred out of the SRS system for publication.

Data retention
The linkage keys used for the health and educational linkages will be securely held by NHS England and the Department for Education respectively. These will be retained for the duration of the agreement should further linkage be required (this will be requested separately to this version of the DSA). Only the pseudonymised dataset will be retained within ONS SRS to facilitate analysis by the UCL research team.


Loneliness among people with 'Complex Emotional Needs' (CEN): A cross-sectional UK study — DARS-NIC-674976-S4T1V

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(2)(a)

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2023-01-23 — 2024-01-22 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Adult Psychiatric Morbidity Survey (APMS)

Objectives:

The objective is to use the Adult Psychiatric Morbidity Survey (APMS) 2014 dataset for research purposes. Specifically, a secondary analysis of the data will be conducted by University College London (UCL) researchers to investigate the relationship between "complex emotional needs" (CEN, or "personality disorder" traits) and loneliness and suicidality outcomes, and the relationship of loneliness to discrimination, self-harm, and suicidal ideation in people with CEN.

Background:

Considering the prevalence and impact of loneliness on both physical and mental health outcomes, recovery, and quality of life, loneliness has been identified as an intervention target with the potential to alleviate symptoms and improve outcomes for people with mental health conditions such as depression, ‘complex emotional needs’ (CEN), and psychosis. The term ‘complex emotional needs’ (CEN) is used to describe people who are diagnosable with a “personality disorder”, who are a neglected group of great research and clinical interest. This is particularly important as the quality of care and available interventions for people with CEN have been criticized as lacking by service users and clinicians.

The research evidence has demonstrated the centrality of loneliness in the day-to-day lives of people with CEN. A quantitative study describing the intensity of loneliness in the lives of people with ‘CEN’ found that poor social functioning and objective social isolation did not account for the severity of loneliness experienced by people with CEN. Not only do these findings illustrate the need to further investigate determinants of loneliness and the effects of loneliness among people with CEN, but they also highlight the probable role of negative and discriminatory societal experiences in exacerbating a sense of loneliness. Recent sociological and psychological theories and studies exploring the relationship between suicidal thoughts and self-harm and loneliness report that a lack of belonging is one of the prominent risk factors associated with suicide and self-harm. Collectively, the evidence points to the need to explore the role of self-harm, suicidal ideation as well as discriminatory experiences and their consequent effects on loneliness outcomes among people with CEN. Although there is a relationship between loneliness and CEN, there is very little known about the relationship between individual CEN traits/symptoms and loneliness outcomes, and whether specific symptoms of CEN are more relevant in relation to loneliness. Given the extent to which interventions focus on symptomatic reductions, investigating whether self-harm and suicidal ideation exacerbate levels of loneliness could provide a pathway for future interventions to target the reduction of loneliness.

Aims and objectives:
In this study, UCL aim to conduct a quantitative study to investigate the relationship between CEN and loneliness and suicidality outcomes, and the relationship of loneliness to discrimination, self-harm, and suicidal ideation in people with CEN. Moreover, UCL will also assess the effects of relevant demographic factors in the relationship between discrimination and loneliness outcomes. UCL will use the APMS 2014 database to explore loneliness in this group and therefore the objectives are to:
- Investigate the relationship between the number of traits endorsed that are suggestive of complex emotional needs (CEN), and the presence of loneliness controlling for stressful life events, including serious injuries/assaults, deaths, financial/job problems, prison time, bullying, violence, sexual abuse, homelessness, being institutionalized or in foster care, and other sociodemographic variables.
- Investigate the relationship between the number of traits endorsed that are suggestive of CEN and the presence of suicidality and self-harm, controlling for stressful life events including serious injuries/assaults, deaths, financial/job problems, prison time, bullying, violence, sexual abuse, homelessness, being institutionalized or in foster care and other sociodemographic variables.
- Assess whether loneliness modifies the relationship between the number of CEN traits and suicidality/self-harm by testing interaction effects.
- Investigate the association between individual CEN traits and loneliness, controlling for covariates.
- Assess the possibility that experiences of discrimination modify the relationship between specific traits suggestive of CEN and loneliness by testing interaction effects.
- Investigate the interaction effect between discrimination and sexual orientation, and loneliness (outcome), among people who meet the cut-off for a diagnosis of CEN.
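
As an illustration of what testing an interaction effect means in the objectives above, one possible formulation is a regression with a product term. The source does not state the model; the logistic form and every symbol below are assumptions made for this sketch.

```latex
\[
\operatorname{logit}\Pr(Y_i = 1)
  = \beta_0 + \beta_1 T_i + \beta_2 L_i + \beta_3 \,(T_i \times L_i)
  + \boldsymbol{\gamma}^{\top}\mathbf{Z}_i
\]
```

Here $Y_i$ would be a binary suicidality/self-harm indicator, $T_i$ the number of CEN traits endorsed, $L_i$ a loneliness measure, and $\mathbf{Z}_i$ the covariates (stressful life events and sociodemographic variables). Under this formulation, loneliness modifies the trait-outcome relationship when $\beta_3 \neq 0$.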


UCL is requesting the Adult Psychiatric Morbidity Survey 2014 (APMS 2014). The UK Data Service (UKDS) (https://ukdataservice.ac.uk/), which holds APMS 2014 on behalf of NHS Digital and is responsible for disseminating it under the direction of NHS Digital, would provide the whole dataset to UCL as there is no facility to select individual variables. Upon signing the data sharing agreement, UCL would then download the pseudonymised APMS data through UKDS for the period specified within the DSA. All APMS data are pseudonymised. In line with the procedures and standards, when the DSA expires, all local copies of the APMS 2014 dataset will be erased and destroyed. The PI will be dedicating a period of one year to work with the dataset, carry out secondary analysis, and write up results.

The data will be held in the UCL Data Safe Haven using UCL-approved computers. The Data Safe Haven is a highly confidential technical solution for transferring and storing data. It meets the requirements for the NHS Digital governance toolkit and ISO 27001 Information Security Standards.

The aim of this secondary analysis of cross-sectional data from the 2014 Adult Psychiatric Morbidity Survey (APMS) is to investigate the relationship between loneliness and “complex emotional needs” (or “personality disorder”) and suicidality outcomes in a representative sample in England. The APMS 2014, the most recent adult psychiatric morbidity survey, provides relevant data on “personality disorder”, loneliness, discrimination, and suicidality for a nationally representative sample of residents in private households in England, aged 16 and over. Therefore UCL requires the full database of this large nationally representative sample to achieve a sufficient sample size to obtain meaningful results for this research. The data obtained will be anonymous and stored and processed only on UCL’s Data Safe Haven. The data will only be accessed by UCL employees, registered UCL Ph.D. students and registered UCL MSc students, as required or necessary. All students sign up to the UCL Academic Manual and will be working under appropriate supervision on behalf of the data controller/processor within this agreement; they will have access to the data only for the purposes described in this agreement.

The data will be processed under Article 6(1)(e) - legitimate interest under public task, as UCL is a public authority and processing data for research is one of UCL’s public tasks. Processing the APMS 2014 database to achieve the objective of the proposed research is necessary and there is no other way of fulfilling the purpose of this research. Further, this is the least restrictive way and a safe way of achieving the research goals as individuals will not be harmed through the processing of this data. The data will be processed under Article 9(2)(j) - processing is necessary for archiving, research, or statistical purposes. Given the adverse effects of loneliness, with severe levels of loneliness predicting suicidal and self-injurious behaviours in people with CEN, and the prevalence of CEN in the general population, it is of public interest to conduct research on loneliness among people with CEN traits. This research would potentially build the groundwork for intervention development. Specifically, investigating whether self-harm and suicidal ideation exacerbate levels of loneliness could provide a pathway for future interventions to both target the reduction of loneliness, therefore increasing personal recovery outcomes, and reduce self-harm and suicidal ideation.

UCL ensures that the processing of data is safeguarded by means of ‘appropriate technical and organisational measures’, which include pseudonymising data.

This research is undertaken as part of a Ph.D., and the Ph.D. student conducting this research is sponsored by UCL and the Economic and Social Research Council (ESRC); however, the ESRC will not be involved in the processing of the APMS 2014 data.

There is no control group in this study as the APMS consists of participants (aged over 16 years) who completed the APMS survey.

Expected Benefits:

Improving the quality of care and broadening the range of interventions offered, to include socially focused intervention targets, such as loneliness, has been prioritised and emphasized by people with lived experience of ‘complex emotional needs’ (or a “personality disorder”), professionals, and policymakers. Given the adverse effects of loneliness and the call for a more holistic treatment approach for people with complex emotional needs, as opposed to a narrow focus on self-harm and symptomatic reduction, loneliness is a potentially promising intervention target. To develop an intervention that targets loneliness, an understanding of the relationship between loneliness and ‘complex emotional needs’ and suicidality, as well as investigating the role of discrimination would be useful and necessary. Loneliness, self-harm, and suicide have been a public health crisis and priority, and ‘complex emotional needs’ traits are common in the general population. Therefore, developing evidence-based interventions targeting loneliness among people with ‘complex emotional needs’ would be more beneficial if rooted in a detailed understanding of the associations of loneliness among people with ‘complex emotional needs’. However, pending this aim of developing an intervention, findings from this study can suggest that asking service users with ‘complex emotional needs’ about social connections and loneliness can be beneficial in formulating a treatment plan. If this study reveals a link between loneliness and self-harm/suicidality then this could provide a pathway for future interventions to target the reduction of loneliness, which would consequently contribute to reductions in symptoms of self-harm and suicidality.

The identification of the specific symptoms that could be playing a particular role in exacerbating loneliness could be helpful in developing an effective intervention, with a focus on those specific symptoms exacerbating loneliness. This would give a target in intervention development.
Given the extent to which interventions for people with complex emotional needs focus on symptomatic reductions of self-harm and suicidality, investigating whether self-harm and suicidal ideation exacerbate levels of loneliness could provide a pathway for future interventions to target the reduction of loneliness. This would increase personal recovery outcomes (i.e. loneliness), as well as reduce self-harm and suicidal ideation.

The study will also make a case for the role of loneliness and the way in which it could maintain symptoms (self-harm), which can promote future studies exploring interventions for people with complex emotional needs to include outcome measures such as loneliness.

Assessing the possibility that experiences of discrimination modify the relationship between specific traits suggestive of CEN and loneliness would both be a targetable focus of intervention and would promote clinical discussion on the way loneliness intersects with sexuality, gender, and ethnicity.
The results of this study could also contribute to a better understanding of the complex interplay between loneliness and mental health. The associated factors and moderators that will be assessed could shape future questionnaire design for a loneliness scale specific to people with complex emotional needs. This is an issue raised recently by people with lived experience.

Outputs:

The results of this quantitative study will be published in a relevant peer-reviewed journal specialising in psychiatry or personality disorders, such as BMC Psychiatry or the Journal of Personality Disorders. A jargon-free, accessible blog describing this study in layperson's terms will also be published on the Mental Elf website and on other relevant websites such as the Loneliness and Social Isolation Mental Health Network website. This study will also be submitted as part of a Ph.D. thesis and presented in research meetings, and will therefore be uploaded and freely available at https://discovery.ucl.ac.uk/ upon completion of the Ph.D. Outputs included in the published papers will contain aggregate-level data with small numbers suppressed. The researcher aims to publish the paper in January 2024. The blog and other forms of dissemination will be published after the journal accepts the manuscript (i.e. March 2024). The results of this research study will be presented at relevant academic conferences with an interest in the link between complex emotional needs and loneliness and suicidality. To increase public awareness the results may also be posted on Twitter. The study results will also be shared in the Loneliness and Social Isolation Research Network Newsletter and other mental health and loneliness-related newsletters. To protect patient confidentiality in publications resulting from analysis of APMS 2014 data, users must: guarantee that any output made available to anyone other than those with whom this agreement is made will meet required standards, including the guarantee, methods, and standards contained in the Code of Practice for Official Statistics and the Office for National Statistics (ONS) Statistical Disclosure Control for tables produced from surveys; and apply the methods and standards specified in the Microdata Handling and Security Guide to Good Practice for disclosure control for statistical outputs.

The results will be disseminated to mental health charities with a focus on targeting loneliness and supporting people with ‘complex emotional needs’, such as befriending networks, and to charities such as The British and Irish Group for the Study of Personality Disorder (BIGSPD). This piece of research forms one component of a Ph.D. thesis focused on using mixed methods to explore loneliness among people with complex emotional needs. The researcher will therefore contextualize and elucidate quantitative findings within qualitative experiences, thereby combining the subjectivity and complexity of reality with standardised and representative findings gathered through the APMS survey. As well as informing policy and potential practice, the results of all these studies will be disseminated sensitively in an accessible blog in laypeople's terms to inform the public (in October/November 2024). The Ph.D. student conducting this research is funded by the Economic and Social Research Council (ESRC) and UCL. This project is a part of a Ph.D. research study.

Processing:

Once the agreement is active, the flow of data will be from the UK Data Service (UKDS), which will allow UCL to download APMS data. There are no other flows of data. Data will only be stored, processed, and held in accordance with UCL’s data protection policy. It will be accessed, held, and stored in the UCL Data Safe Haven, within the Division of Psychiatry, by the research team. The research team consists of a UCL Ph.D. student, an MSc student (registered solely at UCL), and senior researchers and clinicians who are employees of UCL. The UCL Data Safe Haven has its own set of accepted and standard procedures, which are described on the following website: https://www.ucl.ac.uk/isd/services/file-storage-sharing/data-safe-haven-dsh. Data will be stored within the Data Safe Haven, which is characterized by: dual-factor authentication; a firewall that has a default deny policy; data entering via a managed file transfer mechanism; and only the asset owner having permission by default to draw down any data. All those analysing data within the team are required to complete the Data Security Training provided by Health Education England.

All APMS participant data will be included in the data analysis and research study to achieve meaningful results. UCL will securely destroy all local copies of the dataset once the DSA expires and will inform the Data Access Request Service (DARS) as required by standard procedures. This 2014 version of the APMS dataset available via DARS has been redacted based on Disclosure Procedure advice to minimize the likelihood of participants being identified.

Methodology
Sample
UCL is conducting a secondary analysis of cross-sectional data from the 2014 Adult Psychiatric Morbidity Survey (APMS), a survey commissioned by NHS Digital and carried out by the National Centre for Social Research (NatCen). The APMS provides data on mental health and treatment access for a nationally representative sample of residents in private households in England, aged 16 and over. The APMS uses a stratified probability sampling design with two stages: the initial stage is an interview with the whole sample, and the second stage involves clinically trained interviewers conducting face-to-face interviews with a subset of participants.

Measures
The following clinical measures and sociodemographic characteristics collected in the APMS 2014 survey will be used:
Exposure variable:
- Personality disorder traits: personality disorder traits were measured using the Standardised Assessment of Personality - Abbreviated Scale (SAPAS). The SAPAS is an eight-item screen for identifying a possible diagnosis of “personality disorder”. The eight items are the following: 1) difficulty making and keeping friends, 2) identifying self as a loner, 3) difficulty trusting others, 4) tendency to lose temper, 5) tendency to be impulsive, 6) tendency to worry, 7) tendency to depend on others, and 8) tendency to be a perfectionist. The SAPAS has been validated for use in the psychiatric population and the general population. SAPAS scores correlate with nurses’ ratings of externalizing behaviours, with future functioning, and with response to antidepressant treatment. A score at or above the cut-off point of 4 on the SAPAS indicates a high probability of a diagnosis of a “personality disorder”.
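
As an illustration only, the scoring rule described above can be sketched in a few lines. The function names are hypothetical, and the sketch assumes each of the eight items has already been coded as True when the trait is endorsed; it is not the survey's own derivation code.

```python
from typing import Sequence

SAPAS_ITEMS = 8   # number of screening items listed above
SAPAS_CUTOFF = 4  # cut-off point stated in the text


def sapas_score(items: Sequence[bool]) -> int:
    """Count endorsed traits (assumes items are coded True = trait endorsed)."""
    if len(items) != SAPAS_ITEMS:
        raise ValueError(f"expected {SAPAS_ITEMS} responses, got {len(items)}")
    return sum(1 for endorsed in items if endorsed)


def screens_positive(score: int) -> bool:
    """True when the score reaches the cut-off of 4 or more."""
    return score >= SAPAS_CUTOFF
```

For example, a participant endorsing items 1, 2, 3 and 6 would score 4 and screen positive under this rule.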

Outcomes

- Loneliness: Loneliness is assessed using one item from the eight-item Social Functioning Questionnaire, a valid and robust measure to assess social functioning. Participants were asked to what degree they agreed with the statement “I feel lonely and isolated from other people” over the past two weeks. Participants indicated their agreement on a four-point Likert scale, ranging from ‘not at all’ (scored 0) to ‘almost all the time’ (scored 3), generating scores ranging from 0 to 3. To transform these scores and avoid including zero values, one will be added to each score.

- Self-harm, suicidal ideation, and suicide attempts:
- Self-harm: Questions on self-harm were asked in both the self-completion section and the face-to-face interview. Information about self-harm was obtained through the question: Have you deliberately harmed yourself in any way but not with the intention of killing yourself? A variable will be created on self-harm in the past year, derived from either the self-completion section or the face-to-face section.

- Suicidal ideation: The face-to-face interview included questions on suicidal ideation, and participants were asked: Have you ever thought of taking your life, even if you would not really do it? An affirmative response was followed by questions on when this last occurred. A variable will be created on suicidal thoughts in the past year. This approach has been used by other studies.

- Suicide attempts: Questions on suicide attempts were asked in both the self-completion section and the face-to-face interview. Data on suicide attempts were obtained by the question: Have you ever made an attempt to take your life, by taking an overdose of tablets or in some other way? A follow-up question on when this took place was asked. A variable will be created combining reports of suicide attempts in the past year, in either section (i.e. the face-to-face or the self-completion section).
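
The "either section" derivation described for self-harm and suicide attempts could be sketched as below. This is a minimal illustration with hypothetical names; in particular, treating a non-response (None) in one section as uninformative, and returning missing only when both sections are missing, is an assumption rather than a rule stated in the agreement.

```python
from typing import Optional


def any_past_year(self_completion: Optional[bool],
                  face_to_face: Optional[bool]) -> Optional[bool]:
    """Combine past-year reports from the two survey sections.

    True  -> reported in at least one section
    False -> not reported in any answered section
    None  -> both sections unanswered (assumed handling of missing data)
    """
    answers = [a for a in (self_completion, face_to_face) if a is not None]
    if not answers:
        return None
    return any(answers)
```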

Potential modifiers of the association between the exposure and outcome:
- Discrimination: The APMS includes binary measures of experiences of discrimination, based on a computer-assisted self-interview. Participants were asked a series of questions on whether they had been treated unfairly in the past year on the basis of belonging to a particular group, such as their mental health condition, ethnicity, sexuality, sex, age, religious beliefs, and physical health problems or disabilities. UCL will focus on discrimination based on sexuality and mental health condition.

Covariates
The following sociodemographic characteristics recorded in the APMS 2014 survey are covariates that might be associated with both the exposure and the outcome and could therefore confound any association. Variables such as gender, age, Registrar General's Social Class, and marital status are all recorded in the survey; UCL will agree the final set of covariates as a team.

- Gender
- Age
- Socioeconomic status
- Marital status
- Social participation
- Stressful life events
- Childhood emotional, physical, and sexual violence or abuse


Analysis plan

UCL will conduct all analyses using Stata (statistical software). Descriptive statistics, covering sociodemographic and clinical characteristics, will be presented by exposure group. For these descriptive statistics the exposure will be divided into three categories: those scoring 0, those scoring 1-3, and those scoring 4 or more (meeting the criteria for a probable diagnosis).

The relationship between the number of traits suggestive of CEN (continuous measure) and loneliness (binary measure) will be investigated using logistic regression models. UCL will then fit models adjusted for the confounders described, adding blocks sequentially. Similarly, logistic regression models will be run to produce odds ratios for the outcome variables suicidality and self-harm. UCL will then assess whether loneliness modifies the relationship between CEN and suicidality and self-harm by testing the interaction between the number of CEN traits endorsed and loneliness.
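
For readers who prefer notation, the two model families described above can be written out as follows. The symbols are illustrative and not taken from the agreement: T_i is the number of CEN traits endorsed by participant i, L_i the binary loneliness measure, S_i a binary suicidality or self-harm outcome, and C_i the vector of covariates.

```latex
\begin{align}
\operatorname{logit} \Pr(L_i = 1) &= \beta_0 + \beta_1 T_i + \boldsymbol{\gamma}^{\top} \mathbf{C}_i \\
\operatorname{logit} \Pr(S_i = 1) &= \alpha_0 + \alpha_1 T_i + \alpha_2 L_i
  + \alpha_3 \,(T_i \times L_i) + \boldsymbol{\delta}^{\top} \mathbf{C}_i
\end{align}
```

Under this notation, exp(β₁) is the odds ratio for loneliness per additional trait endorsed, and a test of α₃ = 0 corresponds to the interaction test between the number of traits and loneliness.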

To assess whether experiences of discrimination modify the relationship between specific traits suggestive of CEN and loneliness, UCL will be testing for interaction effects between each trait of CEN and the modifier. UCL will carry out a separate statistical analysis to investigate the interaction effect between discrimination and sexual orientation, and loneliness outcomes.

UCL will apply appropriate survey weightings to the data, using the relevant ‘survey’ (svy) commands in Stata 15, which allow for the use of clustered data modified by probability weights and provide robust estimates of variance. This weighting will take account of the complex survey design and of non-response to ensure that estimates are representative of the household population in England.
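
To show what a probability weight does to a point estimate, the following minimal sketch computes a weighted prevalence. The function name and data are hypothetical. It covers the point estimate only; Stata's svy commands additionally use the survey's clustering and stratification to produce the variance estimates, which this sketch does not attempt.

```python
from typing import Sequence


def weighted_prevalence(outcomes: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted proportion of participants with outcome == 1.

    Each participant contributes in proportion to their survey weight,
    so under-represented groups count for more than their raw share.
    """
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * y for y, w in zip(outcomes, weights)) / total_weight
```

With three participants, one of whom has the outcome and a weight of 2.0 while the others have weight 1.0, the unweighted prevalence is one third but the weighted prevalence is one half.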

No linkage to other data will be conducted, and no linkages are requested at this time. There will be no attempt to re-identify any participants.


Modelling impact of interruptions to cancer screening with COVID ( ODR2021_016 ) — DARS-NIC-656876-L4B0V

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 - s261(5)(d)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2023-02-15 — 2024-07-06 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. NDRS Cancer Registrations

Objectives:

Data for this study has previously been shared when the data were controlled and managed by Public Health England (PHE). PHE facilitated data release via its Office of Data Release service (ODR). ODR was responsible for providing a common governance framework for responding to requests to access PHE data for secondary purposes, including service improvement, surveillance and ethically approved research. All requests to access data were reviewed by the ODR and were subject to strict confidentiality provisions. The responsibility for the management of the National Disease Registration Service of which the National Cancer Registration and Analysis Service is a part, transferred from PHE to NHS Digital (Now NHS England) on 1st October 2021.


This project has three aims:

1. Estimate what impact delayed diagnosis and delayed treatment – between one month and one year – will have on:
a. The number of cancers that progress to a more advanced stage by the time they are diagnosed, and
b. Survival from cancer (specific cancers to be investigated are included in the data specification).

2. Model the impact of disruptions to breast cancer screening and identify strategies that could be used when re-starting screening that minimise any harms resulting from such disruption.

3. Predict the demand for diagnostic, treatment, and screening services.

Yielded Benefits:

Our modelling has shown interesting results on the impact of delays, but we have not yet finished our analyses.

Expected Benefits:

As previous.

Outputs:

The output of this project is modelling to show the impact of different lengths of delay on stage at diagnosis and on outcomes. This is relevant both for post-COVID planning and for cancer health policy more generally.

We are applying for an extension due to personnel changes that have slowed our progress.

Processing:

Design of study: Mathematical modelling.

Study population: All adults aged 18 and over in England with one or more of lung, colorectal, prostate, breast, pancreatic, oesophageal, liver, bladder, kidney, or ovarian cancers.

Statistical analysis: This analysis will have three overarching stages:

1) Stage progression. We will analyse retrospective cancer registry data to derive the time taken for each cancer type being investigated to progress between stages.

We will apply through Public Health England's (PHE) Office for Data Release for access to Cancer Registry data for all patients aged 18 or older in England diagnosed with lung, colorectal, prostate, breast, pancreatic, oesophageal, liver, bladder, kidney, or ovarian cancer between 01/01/2013 and 31/12/2017. We will subsequently apply for Public Health England’s rapid cancer registration dataset, which provides data from 2018 up to the present.

Using depersonalised data on age at diagnosis, the year their cancer was diagnosed, the type of cancer, its stage at diagnosis, and year of death (if applicable), we will use a Markov multistate model to enable us to estimate the time taken for each cancer to progress between stages. We will then apply the resulting stage transition estimates to incidence and stage-at-diagnosis data for each cancer to derive the number of cancers expected at localised / advanced stages at different periods of time, under alternative lengths of delays to cancer services.

2) Modelling breast cancer screening. We will use a multistate model taking into account the natural history of breast cancer to derive the probability of a cancer being detected by screening or clinically with different periods of disruption to screening programmes. We will apply this probability to a decision analytic model that uses a life-table approach to understand the impact on cancer outcomes. We will analyse alternative catch-up screening strategies to identify that which best mitigates the disruption on breast cancer mortality and life-years gained.

3) Estimating impacts of delays and demand for cancer services. We will use the estimated number of cancers at different stages to analyse the impact of delays to diagnostic and treatment services on cancer outcomes and demand for services in England. Treatment parameters by stage will be obtained from PHE's Cancer Registry data. All other parameters for the models will be from aggregated anonymous sources, for example those released under an Open Government Licence, or peer-reviewed literature.


We will first develop a continuous-time Markov multistate model to describe the progression of cancer through the following states: healthy, localised cancer, advanced cancer, and dead [1,2]. Using these probabilities, and incidence of cancer by stage, we will estimate the expected number of additional advanced cancer diagnoses and the expected number of localised and advanced cancers at different periods of time.

To analyse the impact of disruption to the breast cancer screening programme and alternative catch-up screening strategies that could be used when re-starting the programme, the following methods will be applied:

1. We will use a multistate model of the natural history of breast cancer in the preclinical phase to derive the probabilities of detecting cancer by screening or clinically (i.e. interval cancers) following different time periods of disruptions to screening services.
2. Using a life-table approach [3,4], accounting for the sojourn time of breast cancer by age, stage, and subtype, and using the derived probabilities for screen-detection and clinical diagnoses, we will model the impact of a suspension of screening services and of the backlog on interval cancer diagnoses and subsequent life years lost. We will consider different catch-up scenarios and identify the scenario that gives the fewest interval cancers and the least loss of life years.
3. We will consider delays in screening of between one month and one year and will liaise with PHE screening regarding alternative re-starting strategies that are under consideration.

Finally, to analyse the impact on cancer outcomes and on predicted demand for cancer diagnostic, assessment and treatment services we will apply data on diagnostic and treatment modalities by cancer stage to estimate demand for services, and aggregate data on 1-year and 5-year survival to estimate impact on survival and mortality.
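
The four-state continuous-time Markov model described above can be illustrated with a small sketch. Everything numeric here is invented for illustration: the transition rates in Q_EXAMPLE are hypothetical, not estimates from the registry data, and the function names are assumptions. The sketch computes the transition probability matrix P(t) = exp(Qt) with a plain scaling-and-squaring Taylor series, then uses it to estimate how many localised cancers would be expected to sit in the advanced state after a given delay.

```python
from typing import List

Matrix = List[List[float]]
STATES = ["healthy", "localised", "advanced", "dead"]

# Hypothetical transition rates per year; each row sums to zero.
Q_EXAMPLE: Matrix = [
    [-0.015, 0.010, 0.000, 0.005],  # healthy -> localised / dead
    [0.000, -0.320, 0.300, 0.020],  # localised -> advanced / dead
    [0.000, 0.000, -0.400, 0.400],  # advanced -> dead
    [0.000, 0.000, 0.000, 0.000],   # dead is absorbing
]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n: int) -> Matrix:
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def transition_matrix(q: Matrix, t: float,
                      squarings: int = 10, terms: int = 12) -> Matrix:
    """Approximate P(t) = exp(Q t) by scaling and squaring."""
    n = len(q)
    scale = 2 ** squarings
    small = [[q[i][j] * t / scale for j in range(n)] for i in range(n)]
    result = identity(n)
    term = identity(n)
    for k in range(1, terms + 1):
        # term becomes small^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, small)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def expected_advanced(n_localised: float, q: Matrix, delay_years: float) -> float:
    """Expected number of initially localised cancers in the advanced state
    after the delay, under the assumed rates."""
    p = transition_matrix(q, delay_years)
    return n_localised * p[STATES.index("localised")][STATES.index("advanced")]
```

Calling expected_advanced(1000, Q_EXAMPLE, 0.5) would give the expected count after a six-month delay under these made-up rates; a real analysis would estimate Q from the registry data and would normally rely on an established multistate modelling package rather than a hand-written matrix exponential.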

Our statistical analysis plan has been chosen as it encompasses robust methods with which the study team has experience for predicting medium and long-term cancer outcomes. There are two major caveats to the quality of the rapid cancer registration data for our analyses: missing variables, and data inaccuracies. For the purposes of our study, we have focussed on ten more common cancers, for which missing variables and data inaccuracies in the rapid cancer registration data are less of a problem than in rarer cancers or cancers of unknown primary. In both cases, earlier data are more accurate than the most recently available months, allowing us to take the inaccuracies into account in our modelling. Importantly, in our analyses our focus is on broad TNM stage as early/advanced for cancers as a whole (e.g. breast cancer, rather than breast cancer subdivided by hormone receptor) such that the impact of missing details, for example of cancer subtype, is limited. In addition, these data remain the best possible within the current context and we feel their use is justified given the role our results may be able to have in supporting policy and planning decisions as the pandemic continues to develop.


Stratifying Genomic Causes of Intellectual Disability by Mental Health Outcomes in Childhood and Adolescence (IMAGINE-2) — DARS-NIC-168879-K2N8Q

Type of data: information not disclosed for TRE projects

Opt outs honoured: Identifiable, Anonymised - ICO Code Compliant, No, Yes (Mixture of confidential data flow(s) with consent and non-confidential data flow(s))

Legal basis: Health and Social Care Act 2012 - s261(5)(d); Health and Social Care Act 2012 – s261(2)(a); Health and Social Care Act 2012 – s261(2)(c), Health and Social Care Act 2012 – s261(2)(a)

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2023-01-09 — 2026-01-08 2023.07 — 2024.05. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, System Access
(System access exclusively means data was not disseminated, but was accessed under supervision on NHS Digital's systems)

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Emergency Care Data Set (ECDS)
  2. Hospital Episode Statistics Accident and Emergency (HES A and E)
  3. Hospital Episode Statistics Admitted Patient Care (HES APC)
  4. Hospital Episode Statistics Critical Care (HES Critical Care)
  5. Hospital Episode Statistics Outpatients (HES OP)
  6. Mental Health of Children and Young People (MHCYP)

Objectives:

The IMAGINE-2 study is a medical research study funded by the Medical Research Council (2020-2024) that aims to investigate the impact of genetic disorders that are associated with learning difficulties on children and young people’s mental health. It is a collaboration between University College London (UCL) and Cardiff University. Cardiff University will not have access to or process the NHS Digital data to be provided for this UCL data request. Cardiff University do not determine the purpose or the means of the data processing for the IMAGINE-2 study and are not therefore considered to be a data controller. The University of Cardiff Investigator has had no input on determining the purpose and means of workstream 1.

The UCL study team is based in the UCL Institute of Child Health, which is a joint research office between UCL and Great Ormond Street Hospital (GOSH). As such, both UCL and GOSH logos are used in the materials for this study; however, GOSH does not play any further role in the study and does not determine any purposes of this study in any capacity.

The IMAGINE-2 study is a follow-up project of the previous one called Intellectual Disability and Mental Health: Assessing Genomic Impact on Neurodevelopment (IMAGINE-ID) which was funded by the Medical Research Council (MRC) and Medical Research Foundation (2015-2020). A collaborator from the University of Cambridge who was involved in the IMAGINE-ID study is involved in IMAGINE-2 as a consultant only who may provide advice to the IMAGINE-2 project. The University of Cambridge will not have access to any newly collected data from the research programme and do not determine the purpose or the means of the data processing for IMAGINE-2. The MRC as funders of IMAGINE-2 do not determine the purpose or the means of the data processing and will not process any of the study data. These organisations are not therefore considered to be data controllers or data processors.

IMAGINE-2 is divided into two workstreams. Workstream 1 aims to map trajectories of developmental risk for individuals with different types of Intellectual Disability (ID). Workstream 1 is led by University College London and requires NHS Digital data. Workstream 2 is led by Cardiff University and involves a face-to-face follow-up study of young people seen during IMAGINE-ID who have been identified as carrying a genetic variation which is high-risk for mental health problems. Cardiff University will independently collect data from participants in IMAGINE-2 by direct contact with the identified high-risk subset of families whom they will visit at home. Cardiff University will not have access to the NHS Digital data.

UCL is the sole data controller and also processes NHS Digital data for the IMAGINE-2 study. NHS Digital data will be handled exclusively by UCL in the Data Safe Haven (DSH), to which only staff who are associated with UCL will have access. Access to the UCL DSH will only be given to individuals who have had the appropriate UCL non-disclosure training and have signed a Non-Disclosure Agreement. No data will be exported outside the DSH.

UCL is a ‘public authority’, as defined in the Data Protection Act 2018, with a principal object of the organisation being research and its dissemination. The processing of identifiable personal data, including special category data, is necessary to carry out medical research that serves the public interest. As such, the legal bases for processing personal data are:
Article 6(1)(e) of the GDPR, ‘processing is necessary for the performance of a task carried out in the public interest’; and
Article 9(2)(j) of the GDPR ‘processing is necessary for archiving purposes in the public interest, scientific or historical research purposes’.

Section 8 of the Data Protection Act 2018 clarifies that “In Article 6(1) of the GDPR (lawfulness of processing), the reference in point (e) to processing of personal data that is necessary for the performance of a task carried out in the public interest or in the exercise of the controller’s official authority includes processing of personal data that is necessary for… (d) the exercise of a function of the Crown, a Minister of the Crown or a government department”. University College London has an established royal charter. It includes the following statement: “The objects of the College shall be to provide education and courses of study in the fields of Arts, Laws, Pure Sciences, Medicine and Medical Sciences, Social Sciences and Applied Sciences and in such other fields of learning as may from time to time be decided upon by the College and to encourage research in the said branches of knowledge and learning and to organise, encourage and stimulate postgraduate study in such branches.”

As a higher education establishment, the University conducts research to improve health care and services, and the linkage requested is necessary for the performance of a task carried out in the public interest, i.e. improving the health outcomes of children with genetic disorders.


The IMAGINE-2 cohort consists of children and young people who were born between 1989 and 2016 and have intellectual disability (ID) or developmental delay caused in whole or in part by a known genetic variant, either a CNV (copy number variant) or an SNV (single gene variant). The study aims to delineate the course and outcomes in CNV-associated intellectual disability (ID) and single gene disorders to provide information at the point of diagnosis and onwards for families, clinicians and service providers, as well as to pave the way to greater biological understanding and the personalisation of interventions. Several recurrent ID-associated CNVs and single gene disorders have been associated with poor mental health outcomes. However, there is considerable pleiotropy (i.e. variations in a single gene may affect multiple, possibly unrelated, observable characteristics of an individual), and also incomplete penetrance for specific psychiatric diagnoses (i.e. some individuals express the associated symptom or trait while others do not, even though they carry the disease-causing gene).

The cohort includes children with genetic disorders that put them at risk of autism, attention deficit hyperactivity disorder, anxiety and psychosis among other conditions. They are also at risk for non-psychiatric disorders including sensory impairments and epilepsy. No study to date has deployed systematic sampling and assessment to determine why some, but not all, ID-related CNVs and single gene disorders are associated with poor mental health outcomes, nor have they identified risk and resilience factors modifying outcomes across this population. Assessing the relative contributions of CNV and/or SNV genetic constitution, ID severity, cognitive profile, social/ environmental risk factors, and physical comorbidities, will highlight major determinants of adjustment. Better care could be provided if those individuals at greatest risk were identified early and if preventive intervention was timely and focused on salient biological and/or social processes. Early identification has the potential to reduce the costs of long-term care, better target key services/interventions, and improve quality of life over the life-course.

Background:
Participants were eligible for the IMAGINE-ID study if they were 4 years of age or over at the point of recruitment between 2015-2019, and if they possessed a genetic variant, identified by an NHS Regional Genetic Centre, that was reported to be causing intellectual disability/ developmental delay. The vast majority of participants in IMAGINE-ID were identified as being eligible by one of 25 UK Regional Genetics Centres (RGC). The original genetic testing was ordered on the basis of unexplained learning disabilities. Eligible families were invited to participate by the paediatric team linked to the RGC. The study was advertised to patient groups through social media and at parent-supported events. Once participants had been recruited, they were invited to complete online assessments of their child’s mental health, behaviour and well-being for the Workstream 1 data collection. 3402 participants were recruited in total. 500 families, a subset of the total sample, have been seen face-to-face for more detailed evaluations by Cardiff University collaborators for Workstream 2 of the study. The new MRC grant (2020-2024) provides funds for further study and the title has been changed from “Intellectual disability and mental health: Assessing genomic impact on neurodevelopment (IMAGINE ID)” to “Stratifying Genomic Causes of Intellectual Disability by Mental Health Outcomes in Childhood and Adolescence (IMAGINE-2)” which will follow up the already-recruited cohort for 54 months, commencing 1st April 2020.

The study has established a patient, parents and carers consultation group. This group has been consulted from the inception of the study and is regularly updated. The group was established to provide feedback, comments and suggestions that have influenced the design and progress of the project.

The Workstream 1 data collection undertaken during IMAGINE-ID provided details of the children’s development, well-being, mental health and adaptive functioning. A brief account of the children’s medical history was obtained. The Adaptive Behaviour Assessment System (ABAS-3) was used to estimate the degree of developmental delay in key domains of adaptive functioning (e.g. language, self-care, motor skills). Children’s mental health was assessed by the Development and Well-Being Assessment (DAWBA), which has been employed in three national UK studies over the period January 1999-December 2017. The DAWBA is a detailed semi-structured interview and covers many areas of development, behaviour and well-being. Rates of mental health disorder and behavioural/emotional dysfunction in the IMAGINE-ID cohort can therefore be directly compared with a nationally representative sample of typically developing children and young people, the Mental Health of Children and Young People (MHCYP) dataset from the UK Data Service. Significant general health problems are described in many cohort children with mental health disorders.

Mental Health of Children and Young People (MHCYP) data request:
MHCYP data is requested for use as the control data to compare with IMAGINE-ID cohort data collected from the assessments of mental health, behaviour and wellbeing (i.e. Workstream 1 data collection). The Mental Health of Children and Young People (MHCYP) survey provides record-level, pseudonymised data on the prevalence of mental disorders in children and young people (aged 2-19 years old) living in England. This dataset contains the same measures as the IMAGINE-ID research data and covers a similar age of children and young adults. No identifiable data is requested from the MHCYP dataset.

HES data request:
The study is applying for access to Hospital Episode Statistics (HES) data to assess the broader healthcare needs of this group as a function of their genetic disorder. Pseudonymised data is requested from NHS Digital but will be linked to existing study data thus making the data technically identifiable. Access to these data will permit a more detailed view of the strengths and difficulties of participants, their use of services, and comorbidities. In the IMAGINE-2 study (2020-2024), a longitudinal perspective will be taken, throughout childhood and adolescence, to ascertain disease trajectory and outcomes relating to social inclusion, education and their health needs. Better characterisation of the trajectories of health risks and wider developmental impact of these diverse ultra-rare genetic disorders will inform and improve future healthcare and management.

The request is to access Hospital Episode Statistics (A&E visits, Critical Care Episodes, Admitted Patient Care Episodes and Outpatients appointments) and Emergency Care Data Set for the IMAGINE-2 cohort and a control cohort. By linking HES and ECDS data to already collected data on the IMAGINE-2 cohort mental health and family circumstances, the study aims to build a detailed picture of the more significant medical healthcare needs of this group of children and young people. The HES control cohort requested by UCL will be matched on age, sex and index of multiple deprivation.
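As an illustration only, exact matching of a control cohort on age, sex and deprivation could look like the sketch below. The source does not describe how the matching is actually performed, so the one-to-one, without-replacement approach and the field names `id`, `age`, `sex` and `imd_decile` are all assumptions.

```python
from collections import defaultdict


def match_controls(cases, pool, keys=("age", "sex", "imd_decile")):
    """Pair each case with one unused control sharing the same matching keys.

    Hypothetical sketch: field names and the 1:1 exact-match strategy are
    assumptions, not the method used for the real control cohort.
    """
    # Group the candidate controls by their matching-key values.
    by_key = defaultdict(list)
    for person in pool:
        by_key[tuple(person[k] for k in keys)].append(person)

    matches = {}
    for case in cases:
        candidates = by_key[tuple(case[k] for k in keys)]
        # Each control is used at most once; None means no match was found.
        matches[case["id"]] = candidates.pop() if candidates else None
    return matches
```

A case with no candidate sharing all three key values is left unmatched rather than paired with a near neighbour.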

Through analysis of the number of hospital visits, number of outpatient appointments, and length of stays in hospital, the study will examine to what extent the population of children and young people with intellectual disability or developmental delay caused by a known genetic variant relies on the NHS healthcare system more than typically developing children. The study will investigate the costs involved in caring for these children, including the costs to the children themselves (for example missing school because they have hospital appointments or are admitted to hospital). The study will examine potential links between specific ultra-rare genetic disorders and the need for specialist intervention in particular healthcare domains. For example, one third of the cohort has had seizures; in some cases, the seizures were associated with genetic anomalies that have never previously been studied in detail because of their rarity. It would not be possible in any other way to analyse cohort data at this scale, which has implications for improving future medical management in this vulnerable population. Preliminary data, from parental reports, indicates a high rate of frequent users, reflecting the complex nature and needs of many of these disorders.

Separately, UCL also require access to a standard extract of Mental Health of Children and Young People (MHCYP) 2017 and 2020 survey data. This data is not linked, and cannot be linked, to either UCL’s cohort or the control cohort. This data will be used for comparison purposes to give UCL a snapshot of statistics in several categories (mental health, behaviour and well-being).

UCL hypothesise that developmental trajectories of children and young people in the IMAGINE-2 cohort with genetic differences will differ to those of individuals without genetic disorders, and that this will be best captured by clusters of traits indicating their mental states, mental illnesses or disorders and impairment of their cognitive development. UCL also hypothesise that the trajectories of these clusters of traits will be impacted by genetic factors and related risk factors such as socioeconomic adversity and family environment. A control cohort matched by age, sex and geographic region will indicate the differences in impact according to genetic factors and index of multiple deprivation for example.

Data linkage:
The study aims to link HES data to the existing research data from Workstream 1 in order to obtain a detailed picture of participants’ use of NHS secondary care services. The data held from Workstream 1 are genetic, medical history and mental health data, which are coded and held in non-identifiable form. The genetic data (from Regional Genetic Centre (RGC) laboratory reports) and observable individual characteristic data (through online psychiatric assessments) will be linked to medical history data in order to build a highly detailed picture of the cohort as it develops over a period of 5 years since participating families were originally recruited and interviewed. The study will compare service usage in this cohort of children to service usage by children and young people in England.

Using the diagnosis categories for each episode, the study will be able to assess the health problems that are associated with each genetic disorder and which are common across genetic disorders. The study will ascertain if there are health problems in common within and across genetic disorders which are not yet well-known or described in the literature and will contribute to the existing body of knowledge.

Using HES data about length of episodes and specialty involved, the study aims to further develop analyses of socioeconomic factors involved in these genetic disorders. This will have two outcomes; to provide information about the level of contact with health services parents might expect if their child is diagnosed with a genetic disorder, and to provide an assessment of the cost to the NHS of caring for this group of children (using NHS Reference Costs), which will be of use in decisions relating to commissioning services. Prevalence of neurodevelopmental disorders is increasing as more children survive due to better care, and with the constant development in genetic sequencing technology, more and more children in the future will have a genetic cause of their developmental delay or intellectual disability identified. Early identification and better understanding of the disease trajectory of these conditions has the potential to reduce the costs of long-term care, better target key services and interventions, and improve quality of life over the life-course.

UCL will use network analyses and machine learning methods to look for commonalities which could indicate the mechanism by which specific genetic disorders are associated with medical/psychiatric disorder expression and provide paths of investigation for therapies. Machine learning is a form of statistical analysis which may be used in this study to analyse the research and HES data. Any machine learning analysis will not involve any personal or identifiable data. Researchers will use various statistical methods in conducting these analyses, such as regression, classification and machine learning methods, time series analysis, statistical inference and natural language processing, according to the approach that is most appropriate for the data.

In order to undertake the analyses as described, the study is applying to access HES A&E, Outpatients, Critical Care and Admitted Patient Care data and ECDS for each of the consented participants, covering as much of their lifespan as is available for each dataset (a range from 1994/95 to present) in order to gain a detailed and accurate picture of their medical history and use of services throughout their lifetime. As the data relates to individuals, the geographical spread of the data requested represents the current geographic spread of participants in England.

The study is requesting the medical history of each participant. There are no alternative or less intrusive ways of obtaining these data. Whilst the study has obtained, for approximately one third of participants, a basic medical questionnaire and a brief medical history, much more detailed data are required. It would be impractical to ask families to recount every healthcare interaction they have had. Obtaining data directly from primary or secondary care providers would not be feasible due to time and complexity, and the size of the cohort.

In order to minimise the data requested, the study is requesting very limited demographic data (limited to demographics relating to health care, such as Integrated Care Board/Trust names). No variables have been requested which are not necessary for the proposed analyses.

Yielded Benefits:

This is a new request for NHS Digital data. No yielded benefits have been attained to date using NHS Digital data. Information on the benefits of the wider IMAGINE-ID study can be found at https://imagine-id.org/

Expected Benefits:

In England, there are over a million people with learning disabilities, a quarter of whom are children of school age. Most moderate to severe intellectual disability (ID) has a genetic cause. The study hopes to have beneficial impacts upon the domains of clinical practice and care services, and quality of life for affected families. Potential benefits to health and social care include better understanding of the mental health and well-being of children with ID caused in whole or in part by a known genetic variant, with the aim of enabling more efficient targeting of healthcare resources to provide the best support to them.

Clinical practice: Clinicians in the NHS increasingly request specialist genetic investigations for children with ID. Usually, the results do not translate into specific recommendations for management or prognosis relating to behavioural adjustment, although families would welcome such knowledge. The patient, parents and carers consultation group gave feedback and comments that they strongly supported more research on genetic investigations for children with ID, in order to gain more understanding and knowledge of the conditions. By combining genotypic (genetic constitution) data on specified genetic disorders with standardised phenotypic (observable individual characteristic) data and longitudinal health service utilisation, this study expects to generate valuable information on the mental health and well-being of this cohort. Identification and characterisation of the trajectory of these rare disorders will be used to provide information to clinicians and health professionals who see patients with these disorders. Better evidence-based information is expected to help clinicians and families in making appropriate health and social care decisions and aid in assessment of whether to undertake interventions or prescribe medication to ameliorate symptoms. If children are able to access better care or take advantage of adjustments or interventions, this may improve outcomes (e.g. better educational attainment, improved social communication and inclusion), which in the long term can result in fewer interactions with mental health care providers, alleviating some of the pressures on the resources of the healthcare system.

Families: Unusual behaviour patterns or emotional disorders associated with ID are often ascribed to inappropriate parenting practices. Recognising common disorder-specific patterns is the first step to reassuring parents and educating clinicians/social support staff. This is expected to reduce self-blaming and stress, with resultant improved quality of life for affected families. The impact of child behaviour can be reliably and easily measured across time, and may independently predict future symptoms and psychiatric disorders, including the interactive process by which behavioural and emotional problems can undermine family/individual quality of life. Through documenting health service utilisation in conjunction with genotypic and phenotypic detail, new opportunities for intervention could arise, thus enhancing parents' economic activity (e.g. by promoting their mental health, reducing school exclusions, limiting risk of parental separation).

Health Care Professionals: Intellectual disability implies global impairments in cognitive skills, yet some developmental trajectories may be preserved (exemplified by the relatively good language skills of children with the genetic disorder Williams syndrome). Gaining knowledge about differences in ability across different genetic disorders has implications for education planning and fostering the maximisation of individual potential. Such discoveries could inform policy on the management of children with ID due to a genetic cause. Information on environmental factors influencing emergence of challenging behaviour linked to genotypic risk could point to genotype specific interventions, reducing risk of transfer to residential care and the associated costs.

These benefits will impact not just the NHS in terms of better evidence for clinical care decisions, but also every family which includes a child with intellectual disability or learning problems due to a genetic cause, both now and in the future. Through the outputs resulting from the analysis of the requested data, we aim to deliver benefits over the next 5 years (2022-2027) initially. UCL and Cardiff University will benefit from the publication of peer-reviewed journal articles which will contribute to further funding applications and research collaborations both nationally and internationally.

Outputs:

The data will be used to create analytical outputs which are subsequently intended to be used in research reports and published in peer-reviewed journals and/or presented at academic conferences appropriate for the nature of the analysis and message.

All outputs of the data analysis will be aggregated and small numbers will be suppressed in line with the HES Analysis Guide. The study will ensure that NHS Digital data will not be linked to any other data which would be likely to make it identifiable.

The audience for this research is primarily clinicians and researchers. However, the research teams are keen to disseminate findings to participants in the study and to the general public. The study will work with patient groups such as UNIQUE (Understanding Rare Chromosome and Gene Disorders), a registered charity which provides information and support to individuals with various genetic disorders, and raises public awareness of the conditions. The outcome of this study may include high-level insights gained through analysis of MHCYP and the IMAGINE research data to provide more information and understanding on the genetic disorders for the affected individuals and the public. UCL aim to submit for publication within 24 months of receipt of the dataset.

Planned disseminations include:

Publications: The study will use open access publication, in line with current Medical Research Council policy, to maximise the impact of peer reviewed publications. Publications will be targeted at a number of different academic and clinical audiences. UCL intend to reach out to the community of researchers working with intellectual disability through specialist publications. A broader readership will be targeted in academic psychiatry that is interested in specific findings of more general relevance through more academic publications, and higher impact journals if appropriate. Journals to be targeted include: The Lancet Psychiatry, British Journal of Psychiatry, Journal of Developmental Disorders, Journal of Intellectual Disability Research, American Journal of Psychiatry, Journal of Health Services Research and Policy, Archives of Disease in Childhood and the British Medical Journal. Both the principal investigator (PI) and the co-PIs have strong track records in these areas, including publications in a range of high impact journals.

Conferences: Researchers intend to present findings at academic conferences and other forums relating to research in neurodevelopmental disorders and behavioural phenotypes, genetics, and rare diseases in the UK, Europe and North America, including: International Society for Autism Research (INSAR), Neurodevelopmental Disorder Annual Seminar, UCL Mental Health symposium, Society for the Study of Behavioural Phenotypes Conference, Royal College of Psychiatrists Conference.

Clinical community: A main objective is to provide clinically valuable information to clinicians from a wide range of specialities, including community paediatricians, clinical geneticists, neurologists (both paediatric and adult) as well as specialists working with intellectually disabled people. The study will target these groups by presenting findings at clinically oriented conferences. The study intends to publish summaries of the findings in wide circulation journals of general interest to practitioners, including family doctors. The study will aim to ensure that its findings are distributed to clinicians and others working professionally with ID through contacts with appropriate professional and specialist Societies, including: Royal College of Paediatrics and Child Health (through publication in the Archives of Disease in Childhood); Royal College of Psychiatrists (through publication in the British Journal of Psychiatry); Members of the British Medical Association (through publication in the British Medical Journal).

Other Intellectual Disability Stakeholders: The study has worked closely with the former Chief Executive Officer (CEO) of the charity UNIQUE to advise upon recruitment of families to the study. The study is also working closely with a wide range of other parent support organizations for children with specific ID-related genetic disorders such as SWAN ('Syndromes without a Name'). This liaison ensures there is a gateway directly into the community of parents and carers of individuals with intellectual disability for whom much of our research will have direct relevance. The study has established a newsletter and a website (https://imagine-id.org/) in collaboration with patient and public involvement (PPI) groups, to make its findings available to all stakeholders with an interest in the research, and its implications for the ID community.

Wider audiences: UCL and Cardiff University are keen to promote public engagement in science at many levels. Research staff, at all sites, participate in outreach activities such as attending conferences held by support groups. Communication of the project outcomes to the general public will take place during the lifespan of the project via on-line announcements (e.g. Twitter and Facebook), as well as meetings with journalists from the popular scientific press and general press. Presentations at symposia and public lectures will broaden the impact of the research on the public. The study also intends to make available information for families in collaboration with patient-support organisations and other stakeholders.

UCL estimate that outputs will start to appear 6 months after receipt of the data from NHS Digital.

Processing:

Data:
UCL are requesting Mental Health of Children and Young People (MHCYP) 2017 and 2020 survey data via the UK Data Service (UKDS). This will enable comparison of the mental health, behaviour and well-being of the IMAGINE-2 participants with intellectual disability (ID) to the MHCYP cohort in the general population.

UCL are also requesting HES APC, HES CC, HES OP, HES A&E and ECDS data for IMAGINE-2 participants and a matched control cohort identified by NHS Digital who do not have recorded ID. UCL will provide NHS Digital with the NHS number, date of birth, postcode and gender of IMAGINE-2 participants on one occasion alongside a unique study Identification. HES data will be returned to UCL with the unique study Identification only. An equivalent pseudo-Identification will be generated for the matched control cohort.

Processing:
The MHCYP data is only available by system access via the UK Data Service (UKDS) and the data will be processed in their secure environment. The MHCYP data will not be linked to either cohort's data. In order to protect patient confidentiality in publications resulting from analysis of MHCYP data, users must apply the following rules:
· zeros should be shown,
· 1-7 to be rounded to 5,
· any other numbers rounded to nearest 5,
· rounding unnecessary for averages etc.,
· percentages calculated from rounded values,
· if zeros need to be suppressed, round to 5.
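The count rules above can be sketched in code. This is a minimal illustration, assuming non-negative integer counts; the function names `round_count` and `rounded_percentage` and the `suppress_zeros` flag are hypothetical and are not part of any UKDS tooling.

```python
def round_count(n: int, suppress_zeros: bool = False) -> int:
    """Apply the listed rounding rules to one non-negative integer count."""
    if n < 0:
        raise ValueError("counts must be non-negative")
    if n == 0:
        # Zeros are shown as-is unless they need to be suppressed.
        return 5 if suppress_zeros else 0
    if n <= 7:
        # Counts of 1-7 are all reported as 5.
        return 5
    # Any other count goes to the nearest multiple of 5
    # (an integer count can never sit exactly halfway).
    return 5 * ((n + 2) // 5)


def rounded_percentage(numerator: int, denominator: int) -> float:
    """Percentage calculated from the rounded values, not the raw counts."""
    rounded_denominator = round_count(denominator)
    if rounded_denominator == 0:
        raise ValueError("denominator rounds to zero")
    return 100 * round_count(numerator) / rounded_denominator
```

Averages and similar summary statistics are left unrounded under the rules, so the sketch does not cover them.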

The HES data will be stored and analysed securely within the UCL Data Safe Haven environment. Once the data are received, the IMAGINE-2 research team at UCL will perform exploratory data analysis and clean data as required (e.g. by removing or flagging missing data, subsetting the data for ease of analysis and any other necessary processing in order to make the data ready for analysis). The study intends to link records in the IMAGINE-ID database regarding details of the cohort’s development, well-being, mental health and adaptive functioning to HES data for this project specifically. This will be done using each cohort member’s unique study ID. There will be no requirement or attempt to re-identify individuals.
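The linkage on the unique study ID might look like the following sketch. The record layout and the field name `study_id` are assumptions for illustration; the source does not specify the file formats or tools used inside the Data Safe Haven.

```python
def link_by_study_id(research_rows, hes_rows):
    """Attach HES episodes to the research record sharing the same study ID.

    Hypothetical sketch: rows are plain dicts and `study_id` is an assumed
    field name. No direct identifiers are needed for the join.
    """
    # Index the HES episodes by study ID so each research record is
    # matched in a single lookup.
    episodes = {}
    for row in hes_rows:
        episodes.setdefault(row["study_id"], []).append(row)

    # Research records without any HES episodes get an empty list.
    return [
        {**record, "episodes": episodes.get(record["study_id"], [])}
        for record in research_rows
    ]
```

Because the join key is the study ID alone, the linked output carries no NHS number, name or address.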

All research data held in the Data Safe Haven for analysis is kept separate from the identifying data files (also stored in the Data Safe Haven). These data are kept in a separate secure system within the Data Safe Haven. The research data and identifiable data will not be linked, with the exception of date of birth and postcode for demographic analysis purposes if required. The raw data will not be transferred out of the Data Safe Haven. At the stage of creating publications or presentations, only aggregate and summary data with small numbers suppressed in line with the HES Analysis Guide will be transferred out of the Data Safe Haven. All data will be processed by UCL.

VIRTUS LONDON 4 do not access data held under this agreement as they only supply the building. Therefore, any access to the data held under this agreement would be considered a breach of the agreement. This includes granting of access to the database(s) containing the data.

The data will be stored for use as part of the research analysis carried out by the IMAGINE-2 study only. It will not be available at record level to third parties and will not be available for any commercial use. Access to the IMAGINE ID data within the UCL Data Safe Haven is controlled by the IMAGINE ID Principal Investigator at UCL. Only authorised users have access and access is via 2-factor authentication (username, password and authentication code).

Under this Agreement, the data will only be processed by substantive employees of UCL and those with access to the data have taken information governance training and are aware of their responsibilities and obligations.


Cancer Registry-wide study in infants with neuroblastoma; Task 11.4 of the ENNCCA Network of Excellence (ODR1516_119) — DARS-NIC-656760-F8Y3C

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 - s261(5)(d)

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2023-01-01 — 2023-12-31 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. NDRS Cancer Registry
  2. NDRS Cancer Registrations

Objectives:

The ENCCA (European Network for Cancer Research in Children and Adolescents, www.encca.eu) Network of Excellence aims to accelerate clinical and translational research in paediatric and adolescent oncology and to promote evaluation of and access to innovative therapies. The ENCCA network of 34 partners spans 11 European countries and includes 27 eminent paediatric oncology institutions. ENCCA links all the multinational clinical trial groups and national childhood cancer professional societies across Europe in an ENCCA European Clinical Research Council. It is structured as a consortium that will carry out 18 working packages. Work package 11 aims to establish methods of linkage of the population-based cancer registries with other forms of routine health care data to conduct future research in childhood cancers where the overall population has a good prognosis. Work package 10 aims to develop risk-adapted therapies in solid tumours, mainly neuroblastoma.

Primary aim: To understand the outcomes of current treatments for neuroblastoma in infants in relation to the success of first-line therapy (event-free survival) and the burden of treatment received by the individual child and reasons for any differences between countries.

This project will develop mechanisms and methods of collaborative work between the population-based cancer registries and the clinical databases across the participating European countries and clinical registries. The aim will be to link the series of cases arising in a well-defined (by age at diagnosis) population of infants with neuroblastoma and registered in cancer registries, enhanced with the detailed information held in the clinical databases/hospital records at the patient's treatment centres.

The objective is to understand the reasons for the observed decline in overall survival rates for infants diagnosed with neuroblastoma in England in the 2000s. Hypotheses that will be explored include whether it is due to poorer compliance with the international 'best practice' standards of diagnosis and treatment following the ending of the SIOPEN INES 99 study in 2004. There has since been no open clinical trial in the UK for neuroblastoma in this age group.

Yielded Benefits:

An extensive list of publications published by ENCCA can be found at the following link: http://worldspanmedia.s3.amazonaws.com/media/siope/wp-content/uploads/2013/06/SIOPE-ENCCA-Scientific-Articles.pdf. Information on how the research has so far benefitted the provision of health and social care can be found within these publications.

Expected Benefits:

The aim of this study is to better understand the outcomes of current treatments for neuroblastoma in infants in relation to the success of first-line therapy (event-free survival) and the burden of treatment received by the individual child and reasons for any differences between countries. Carrying out this research may assist in identifying optimal practice in the clinical care of children with neuroblastoma, and therefore has the potential to benefit the provision of health and social care in England.

Outputs:

It is anticipated that the study findings will be published in peer-reviewed journals and will also be presented at relevant conferences.

Should the opportunity arise, the study may publish findings on the ENCCA webpages, hold open lectures or engage with the press. This will aid the dissemination of the findings and reach interested groups in civil society.

Processing:

All data held under this Agreement was disseminated to the International Agency for Research on Cancer (IARC) by Public Health England (PHE) prior to its dissolution in October 2021 under the assigned reference of ODR1516_119.

No identifiers were provided to PHE to support this dissemination; instead, a set of inclusion criteria, previously agreed with the National Disease Registration Service (NDRS) analysis team, defines the cohort. The NDRS have previously pseudonymised cancer registration data based on these inclusion criteria.

IARC has conducted statistical analyses on the data provided to fulfil the study's aims and objectives. Training for all Data Users will be organised by the Information Security Officer and the Director for Administration and Finance on a periodic basis.

Data is kept in a safe and secure environment, available only to authorized users with a legitimate need to access them, and protected against unauthorised access. IARC's System Level Security Policy enforces the following controls:

Access Control:
Physical and logical access controls must be established in order to protect the Data at all times. The Data is stored in a secure location requiring either badge or key access to its physical location. Logical access will be controlled with Access Control Lists (ACLs), username and strong password combinations, and file and share level permissions.

User Access:
Access to the Data will only be granted by the Principal Investigator (PI). A central log of users having access to the Data will be maintained.

Passwords:
All passwords will be strong in nature. Passwords must never be written down or shared.

Virus Protection:
All IT equipment storing or accessing the Data will have up-to-date anti-virus protection.

Operating System Management:
All IT equipment storing or accessing the data must be updated automatically on a regular basis with the operating system and security patches in order to avoid potential security breaches.

Backup:
The Data will be backed up on a regular basis and stored in a separate location from the original data in order to allow the recovery of the data after a major incident.

Logging:
Logging of access to the Data will be put in place in order to allow a clear audit trail to be maintained of access and modifications made by each authorised User.

Network Security:
The network where the data is stored will be secured to avoid unauthorized access to the information. Network segregation and firewalls should be implemented to increase the safety of the data.


Prevalence, clinical characteristics and impact of body dysmorphic disorder in young people — DARS-NIC-259538-Q4V0W

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 - s261 - 'Other dissemination of information', Health and Social Care Act 2012 - s261(5)(d)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2022-03-07 — 2023-09-06 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Mental Health of Children and Young People
  2. Mental Health of Children and Young People (MHCYP)

Objectives:

The research team at University College London (UCL) are requesting access to the 2017 Mental Health of Children and Young People (MHCYP) survey dataset for the purpose of examining the prevalence, clinical characteristics, and impact of body dysmorphic disorder (BDD) in young people.

Body dysmorphic disorder (BDD) is characterised by excessive preoccupation with perceived flaws in physical appearance (most commonly facial features), which appear minimal or completely unobservable to others. Sufferers typically engage in a range of compulsive and repetitive behaviours, such as extreme grooming rituals, often in an attempt to conceal or correct their perceived appearance flaws. The disorder has a devastating impact on quality of life, yet remains under-diagnosed, under-researched and poorly understood. There have been no epidemiological studies of BDD in young people, and therefore many fundamental questions remain unanswered. To this end, this project intends to examine the prevalence, clinical correlates and impairment associated with BDD in young people. This information could have direct implications for the detection and diagnosis of BDD and may identify care needs which could assist those designing, commissioning, and delivering Child and Adolescent Mental Health Services (CAMHS). More specifically, the project aims to answer the following questions:
- What is the prevalence of BDD in young people?
- How does the prevalence of BDD vary with age and sex?
- What are the patterns of psychiatric comorbidity associated with BDD?
- What is the psychosocial impairment associated with BDD?
- Is BDD associated with service utilisation?

The MHCYP 2017 data are uniquely able to address the aims of this project as this is the only population-based survey to include assessment of BDD in young people, either in the UK or internationally.

The MHCYP 2017 survey included 9,117 children and young people aged 2 to 19 years old, who were recruited from a stratified probability sample taken from GP registers. Parents reported on younger children, with additional self-report questions for those aged 11-16. Young people aged 17 and over completed their own questionnaires.

The survey included the Development and Well-Being Assessment (DAWBA), a validated diagnostic assessment tool. The DAWBA assesses a wide range of psychiatric disorders. Parents and young people (aged 11 upwards) complete a series of questions online. Within each diagnostic category, initial screening items are presented, followed by more detailed questions. If screening items are not endorsed then the informant can skip to the next diagnostic category. Parent and child responses are aggregated to determine whether a diagnostic threshold is met.

The DAWBA has been used in previous surveys of child and adolescent mental health in the UK (the British Child and Adolescent Mental Health Surveys), but in the MHCYP 2017 survey the DAWBA was extended to include assessment of body dysmorphic disorder (BDD) for the first time. In addition to the DAWBA, the MHCYP 2017 survey included the Strengths and Difficulties Questionnaire (SDQ), which is a validated dimensional measure of mental health difficulties and impacts. Furthermore, data on the socioeconomic circumstances of the family and the child or young person's contact with services was collected.

The data controller and processor will be UCL. Only those who are substantive employees of UCL will be accessing and analysing the data requested within the UCL secure research facility.

The GDPR lawful basis for UCL to process this data is Article 6(1)(e): 'processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller'. As per section 8(c) of the Data Protection Act (DPA) 2018, this “includes processing of personal data that is necessary for the exercise of a function conferred on a person by an enactment or rule of law”.

Power is conferred upon the university by the University College London Royal Charter “to provide education and courses of study in the fields of Arts, Laws, Pure Sciences, Medicine and Medical Sciences, Social Sciences and Applied Sciences and in such other fields of learning as may from time to time be decided upon by the College and to encourage research in the said branches of knowledge and learning and to organise, encourage and stimulate postgraduate study in such branches”.

The legal basis for the processing of special category data is GDPR Article 9(2)(j), for research or statistical purposes. This is further supported by section 10 of the DPA 2018, which states that an exception can be made to the “prohibition on the processing of special categories of personal data” if “the processing meets the requirement in point (b), (h), (i) or (j) of Article 9(2) of the GDPR for authorisation”.

This project falls under this category because body dysmorphic disorder (BDD) is a major public health concern in the United Kingdom yet remains poorly understood.

Patient and Public Involvement (PPI) will be central to the generation of outputs for this project (see also section 5c), particularly public facing outputs such as blogs, podcasts, and leaflets for schools. Such materials will be developed in conjunction with PPI groups, including those linked with the BDD Foundation. The BDD Foundation is the UK's national charity for BDD and plays a key role in raising awareness of the condition and its treatment. The research lead for this study, who is a substantive employee of UCL, is a clinical advisor for the BDD Foundation, and will work closely with their PPI group to identify relevant forums for dissemination of findings (e.g. relevant podcasts), and to ensure appropriate wording in written outputs.

This project does not link to any wider studies or collaborations.

Expected Benefits:

The first outputs for this project are expected within a year of receiving the Mental Health in Children and Young People (MHCYP) 2017 data. The project will hopefully lead to recommendations which may have an impact on the following categories.

1) Impact on young people with BDD.
Outputs from this work may have direct clinical benefits for young people up to the age of 19 years with body dysmorphic disorder (BDD) across the UK. Findings may also be generalisable internationally. Although the prevalence of BDD is currently unknown, existing research indicates that the disorder is likely to affect 1-2% of adolescents, which equates to approximately 80,000 young people in the UK. BDD is grossly under-diagnosed and there are long delays in young people accessing treatment. This project may help to identify the characteristics of BDD in young people, which could help in raising awareness of this condition. In addition, this project aims to identify demographic and clinical correlates of BDD in youth, which could assist in improving detection and diagnosis of the disorder.

2) Impact on clinical services.
Recommendations based on the study may be able to guide professionals in the detection and diagnosis of BDD, thereby aiding early access to effective treatment. Understanding who is most likely to be affected by BDD (i.e. demographic and clinical correlates) could inform targeted screening for BDD within Child and Adolescent Mental Health Services. Similarly, those offering mental health support in schools (e.g. Education Mental Health Practitioners, school nurses and counsellors) could benefit from improved knowledge of the extent of service need and also which groups are particularly vulnerable to experiencing BDD. This could in turn improve prevention and early intervention.

3) Impact on commissioning.
A major beneficiary will be those designing and commissioning Child and Adolescent Mental Health Services (CAMHS) in England, as the research could provide evidence of the prevalence of body dysmorphic disorder and associated clinical needs.

4) Impact on policy.
Policy and commissioning could be impacted at a national level, with beneficiaries including the Parliamentary Health and Social Care Select Committee, and the Women and Equalities Select Committee. These have an important role in holding the Government to account on child mental health policy, and have an interest in addressing body image problems as demonstrated by their recent inquiries into this topic (e.g. “Changing the Perfect Picture: an enquiry into body image” published in April 2021). Other relevant national bodies include NHS England, as well as regional specialist mental health and child health commissioning networks. Policy briefings will be widely disseminated across these groups.

5) Impact on society.
As described above, this project may improve the understanding of BDD in young people thereby increasing early detection, diagnosis and treatment of this condition. Previous BDD research has shown that longer duration of illness is associated with poorer treatment response. Therefore, improving diagnosis and treatment of BDD in youth, which is when the disorder usually emerges, is likely to improve long-term outcomes. This is not only important at an individual level, but may also have benefits at a societal level. If left untreated, BDD is a chronic disorder and associated with unemployment and high levels of service utilization in adulthood. Early diagnosis and treatment is likely to reduce the financial impact of BDD.

Outputs:

This project aims to contribute to a better understanding of the prevalence and impact of body dysmorphic disorder in children and young people. Specific planned outputs are:
- Peer reviewed journal articles of international standing e.g. Journal of Child Psychology and Psychiatry, Journal of the American Academy of Child and Adolescent Psychiatry (Winter 2022).
- Conference presentations to a range of audiences including health and education (Summer-Autumn 2022).
- Blogs and other public facing outputs, including via social media (e.g. the researchers' Twitter accounts; @georginakrebs and @argStringaris). Public facing outputs will target schools, health professionals and the general public. These will be developed in conjunction with PPI groups and partners such as the BDD Foundation, a national charity (Summer-Autumn 2022).

All outputs will present only aggregate data with small numbers suppressed, in accordance with the special conditions detailed under this Agreement.

Processing:

NHS Digital are the data controller for the MHCYP 2017 data survey. The survey is being carried out by NatCen Social Research and the Office for National Statistics who are co-data processors.

The collected data is checked, derived further (if required), minimised and pseudonymised (direct and indirect identifiers removed). The pseudonymised data asset is then sent to NHS Digital for information and to the UK Data Service (UKDS) (www.ukdataservice.ac.uk) for storage and further dissemination. Before UKDS are able to release the data to UCL, a Data Sharing Agreement (DSA) must be signed with NHS Digital.

The UKDS will securely transfer the entire standard MHCYP 2017 dataset to UCL. It is not possible to obtain individual variables. Personal data such as names, addresses and dates of birth are not included, only the unique serial number used to represent participants. To minimise the risk of re-identification in this pseudonymised dataset, the study team will also follow the “Disclosure control for microdata produced from social surveys” guidance set out by the Government Statistical Service.

UCL will electronically store the data in their 'Data Safe Haven', a secure research storage and processing environment, which has security assured under a Data Security and Protection Toolkit. Analysis will take place within the secure research environment. Only study team members who are substantive employees of UCL will have the authorisation to access the data for the purpose(s) described and will access the storage and processing environment using their individual password. Remote access will occur via multi-factor authentication over an encrypted connection. Only aggregate data (with small number suppression applied) will be exported for the purpose of dissemination of findings. The data will only be used for the purposes described in the agreement.

UCL will hold data as per the Data Sharing Agreement length, after which the data will be securely destroyed according to the Data Sharing Agreement between UCL and NHS Digital, unless an extension is applied for and granted.

Only aggregated outputs will be made available to third parties in peer reviewed publications and open access reports. There will be no requirement or attempt to re-identify individuals.

Data management will be done using a statistical analysis package. All analyses will be conducted using survey weights and controlling for complex survey design where appropriate, and for non-response. Descriptive statistics will initially be used, with stratification by age and gender where appropriate. This will be followed by the use of logistic and linear regression models to examine the association of a range of psychiatric comorbidities with BDD, as well as psychosocial impairment and service utilisation.
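As an illustration of the weighted descriptive step described above, the following is a minimal sketch of a survey-weighted prevalence estimate stratified by a grouping variable. The field names (`weight`, `has_bdd`, `sex`) and the records are hypothetical and are not taken from the MHCYP 2017 dataset. The sketch covers only the point estimate; it does not attempt the variance estimation that a complex survey design requires.

```python
from collections import defaultdict


def weighted_prevalence(records, group_key):
    """Return {group: weighted proportion of records flagged has_bdd}.

    Each record is a dict with a survey weight ("weight"), a boolean
    case flag ("has_bdd") and a grouping field named by group_key.
    All field names are hypothetical.
    """
    case_weight = defaultdict(float)
    total_weight = defaultdict(float)
    for rec in records:
        group = rec[group_key]
        total_weight[group] += rec["weight"]
        if rec["has_bdd"]:
            case_weight[group] += rec["weight"]
    # Skip groups whose weights sum to zero to avoid division by zero.
    return {
        group: case_weight[group] / total
        for group, total in total_weight.items()
        if total > 0
    }


# Made-up records, for illustration only.
example_records = [
    {"sex": "F", "weight": 1.5, "has_bdd": True},
    {"sex": "F", "weight": 0.5, "has_bdd": False},
    {"sex": "M", "weight": 1.0, "has_bdd": True},
    {"sex": "M", "weight": 3.0, "has_bdd": False},
]
prevalence_by_sex = weighted_prevalence(example_records, "sex")
```

In practice a statistical package with dedicated survey procedures would normally be used, so that standard errors reflect stratification and clustering.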


Educational outcomes in children born after assisted reproductive technology; a population-based linkage study — DARS-NIC-258079-G7W1Y

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant, Identifiable (Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 – s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Non-Sensitive, and Sensitive

When:DSA runs 2022-02-28 — 2025-02-27 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Birth Notification Data
  2. Civil Registration - Births
  3. Demographics

Objectives:

Infertility problems are common, with approximately 1 in 6 couples experiencing difficulty in conceiving children naturally. Fertility treatments can help many of these couples, and their use is rapidly increasing. Fertility treatments in which either eggs or embryos are handled are called assisted reproductive technologies (ART), and they include in vitro fertilisation (IVF) and other related techniques. Nearly 1 in 50 children in the UK are born to parents who have benefited from fertility treatments. However, a major concern remains outstanding: whether children born after assisted reproductive technologies (ART) are at higher risk of developing learning or behavioural problems.

There are biologically plausible reasons for increased vigilance regarding the development of children conceived after ART. These procedures involve the handling of eggs and embryos outside of the body at a vulnerable period of early human development, which could impact upon development of the nervous system and brain (neurodevelopment) in children conceived in this way. ART also carries increased risk of multiple births, premature delivery, and low birth weight, all of which are adversely associated with neurodevelopment.

The long-term cognitive and behavioural development of this increasing population of children has not been adequately investigated to date, because existing studies have been small, and have not included adequate comparison groups. A recent systematic review summarised that ‘there is insufficient evidence to conclude whether the long-term neurodevelopment of children born after ART is comparable to that of spontaneously conceived children’. The European Society of Human Reproduction and Embryology has emphasised that high-quality research is essential to understand whether or not this increasing population of children are at higher risk of developing any problems as they grow and develop, so that couples considering ART can receive appropriate and reliable information, and so that any problems in children can be identified and managed early.

University College London (UCL) are therefore conducting a population-based cohort study (a research study in which group(s) of individuals are followed over time) to compare educational outcomes in children born after ART procedures with two control (or comparison) groups of naturally conceived children: 1) naturally conceived siblings of the ART children, and 2) unrelated children chosen at random from the same schools (school-matched controls).

A parallel study conducted by UCL investigating children’s health following ART has identified a cohort of children born following non-donor ART procedures (ART procedures which did not involve the use of donor sperm or eggs) throughout England between 1992-2009 (n=86,064), as well as the naturally conceived siblings of these children (n=23,299). These cohorts were identified by linking data from the Human Fertilisation and Embryology Authority (HFEA) database with the Office for National Statistics (ONS) birth records. This data is securely held by NHS Digital, with each individual’s records in the dataset pseudonymised using a unique member number (UMN). Pseudonymisation is a technique which helps to protect the personal information of data subjects by replacing information in a data set that identifies an individual with a reference number.
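The general idea of pseudonymisation described above can be sketched as follows. This is a generic illustration only: the agreement does not describe how UMNs are actually generated, so the random-token scheme, the function name `pseudonymise` and the field names are all assumptions.

```python
import secrets


def pseudonymise(records, id_fields):
    """Replace identifying fields with a random reference number.

    Returns (pseudonymised_records, key_table). The key_table maps the
    original identifiers to reference numbers and would need to be held
    separately from the research data. Hypothetical scheme, not the
    actual UMN method.
    """
    key_table = {}
    pseudonymised = []
    for rec in records:
        identity = tuple(rec[field] for field in id_fields)
        # Reuse the same reference number for repeat records of one person.
        if identity not in key_table:
            key_table[identity] = secrets.token_hex(8)
        clean = {k: v for k, v in rec.items() if k not in id_fields}
        clean["umn"] = key_table[identity]
        pseudonymised.append(clean)
    return pseudonymised, key_table


# Made-up records, for illustration only.
raw = [
    {"name": "Example Child A", "dob": "2001-01-01", "birth_weight_g": 3200},
    {"name": "Example Child B", "dob": "2002-02-02", "birth_weight_g": 2900},
    {"name": "Example Child A", "dob": "2001-01-01", "birth_weight_g": 3200},
]
pseudo, keys = pseudonymise(raw, id_fields=("name", "dob"))
```

Removing direct identifiers does not by itself prevent re-identification from the remaining variables, which is why additional disclosure controls are usually applied to pseudonymised data.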

For this study, the proposal is to explore educational outcomes in this cohort of children born following ART procedures as well as their naturally conceived siblings. This will be achieved by linking the data for these children (held by NHS Digital) with the National Pupil Database (NPD), which is held by the Department for Education (DfE). The NPD contains detailed information about the educational achievement of all pupils in state sector schools and sixth-form colleges in England. Following change in legislation with the Digital Economy Act 2017, which aims to enable and facilitate the secure use of data from across the government sector for research, access to NPD data is now being made available for research purposes through the ONS Secure Research Service (ONS-SRS). The NPD will also be used to identify a second control (or comparison) group of unrelated (non-sibling) school-matched controls for the ART Cohort (unrelated children chosen at random from the same schools as the ART conceived children). The DfE will transfer the identifiers (including names, date of birth, sex, and postcode) of the unrelated school-matched controls to the ONS. The ONS will then identify information related to the birth of these children (birth weight, multiple birth status, and maternal age) from the national birth records. This birth information is important to ensure that the educational outcomes in these school-matched controls can be meaningfully compared with the ART conceived children. The ONS will then pseudonymise this birth information for the school-matched controls, and deposit it into the ONS-SRS where the pseudonymised data will be analysed by the UCL research team.
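The selection of school-matched controls (unrelated children chosen at random from the same schools as the ART conceived children) could be sketched as below. The agreement does not state the matching ratio, the sampling procedure or the data layout, so `per_case`, the fixed seed, the tuple layout and the function name are assumptions made purely for illustration.

```python
import random
from collections import defaultdict


def pick_school_matched_controls(pupils, excluded_ids, per_case=1, seed=0):
    """Randomly match each ART-conceived pupil to controls from the same school.

    pupils: iterable of (pupil_id, school_id, is_art) tuples (hypothetical layout).
    excluded_ids: pupils who must not be used as controls, e.g. siblings.
    Controls are drawn without replacement within each school.
    """
    rng = random.Random(seed)
    pool = defaultdict(list)
    cases = []
    for pupil_id, school_id, is_art in pupils:
        if is_art:
            cases.append((pupil_id, school_id))
        elif pupil_id not in excluded_ids:
            pool[school_id].append(pupil_id)

    matches = {}
    for case_id, school_id in cases:
        candidates = pool[school_id]
        # Take fewer controls if the school has run out of candidates.
        chosen = rng.sample(candidates, min(per_case, len(candidates)))
        for control_id in chosen:
            candidates.remove(control_id)
        matches[case_id] = chosen
    return matches


# Made-up pupils, for illustration only.
example_pupils = [
    ("art1", "school_A", True),
    ("ctl1", "school_A", False),
    ("ctl2", "school_A", False),
    ("sib1", "school_A", False),
    ("art2", "school_B", True),
    ("ctl3", "school_B", False),
]
matched = pick_school_matched_controls(example_pupils, excluded_ids={"sib1"})
```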

Through the linkage of several large existing national datasets, the proposed study will address an important gap in scientific knowledge regarding the educational achievement of children born following ART. It has a number of important methodological strengths compared to existing studies: much larger size, follow-up of children throughout the full age range of school education in England (4 to 18 years), national coverage, and the inclusion of two control (or comparison) groups of naturally conceived children (siblings of ART children, and unrelated school-matched controls) to provide a more robust and reliable assessment of educational achievement in the ART conceived children.

Data linkage using existing national datasets is the least intrusive, most efficient, and only feasible methodology for adequately investigating this research question with a sufficiently large and representative sample size. The existence of the HFEA database in the UK, with the mandatory recording of all treatment cycles of ART undertaken nationally as a legal requirement, offers an internationally unique opportunity to study the health and development of children conceived following ART. Linkage of the national cohort of children born after ART with the NPD is the only way in which the HFEA database can be utilised to investigate the educational outcomes in these children.

Within the remit of this data sharing agreement, UCL specifically aims:
i. To compare educational outcomes among children born following ART with children born following natural conception.
ii. To compare the frequency of special educational needs (SEN) and school exclusion among children born following ART with children born following natural conception.
iii. To compare outcomes for specific types of ART (e.g. intracytoplasmic sperm injection vs in-vitro fertilisation; fresh vs cryopreserved cycles) and specific causes of infertility.

It is necessary to use data for the whole available cohort of children conceived following ART (born between 1992-2009) for this study. This is because the large size of the study cohort is a key methodological strength of this study, and is necessary for the study to deliver reliable and accurate results. The size of the study cohort is critical to allow the study to have the statistical power to adequately address the research question, by providing sufficient numbers of children in the ART conceived group and the sibling control group at each Key Stage level of assessment in the national educational curriculum. The size of the study cohort is also essential to facilitate planned subgroup analysis, which will compare outcomes separately depending on the type of ART treatment used, and depending on the cause of infertility. Under the terms of the Human Fertilisation and Embryology Act 2008, HFEA data on ART cycles carried out before 01/10/2009 can be used for research, subject to ethical approval, without explicit patient consent. Patients can withdraw consent, but only around 350 have done so. This makes the data prior to 01/10/2009 virtually complete. After this date, an opt-in system for research consent applies and the data is much less complete. The data after 01/10/2009 is not covered by Confidentiality Advisory Group s251 approval and is therefore not available for use in this study.

The principles of data minimisation will be respected by limiting the data transfer to the minimum number of individuals necessary (individuals within the study cohorts), by limiting the variables (or data items) transferred to the minimum necessary to facilitate the linkage and subsequent analysis (see ‘Processing Activities’), and by pseudonymisation of the data as soon as data linkage is complete (prior to data analysis by UCL). It is not possible to further minimise the data, as the size and geographic spread of the study cohort across England are essential for the study to have sufficient statistical power to adequately investigate the research question. The ART cohort includes all children born in England following IVF procedures between 1992-2009.

With respect to the processing of information flowing from NHS Digital, the Data Controller for the study will be University College London (UCL). The study purpose and design has been determined by the research team at UCL.

UCL, the DfE and the ONS will be Data Processors. The DfE are processing the data for their respective linkage step (combining data for the study groups with the National Pupil Database). UCL research staff will process the record level pseudonymised data after data linkage processes have been completed. The linked pseudonymised data will be held within the ONS-SRS for analysis by UCL research staff. The ONS-SRS is a system hosted by the ONS which will store the linked record level NHS Digital data. ONS staff check all output to be transferred out of the SRS to ensure that it is aggregated with small numbers suppressed (so that information regarding individuals is not disclosed), and to ensure that it cannot be combined with other data sources to identify individuals.

The main ethical consideration regarding the proposed data dissemination is the use of individuals’ data for research without individual consent being sought. Favourable independent review of the study proposal by a Research Ethics Committee (REC) and the Health Research Authority Confidentiality Advisory Group (CAG) has confirmed that use of individual data without consent is justified on the basis of the public interest of this high-quality research and the safeguards in place to maintain data confidentiality (pseudonymisation of the data before it is made available to the UCL research team for analysis, and the data security assurances for the data controllers and processors).

Access to personal identifiable data from NHS Digital will be restricted to a limited number of staff at the DfE who are experienced in handling sensitive data and who are bound by confidentiality agreements as part of their employment contracts and codes of practice. Robust security measures will be in place for the transfer and storage of personal identifiable and sensitive data, and the UCL research team will not have access to any personal identifiable data at any stage. The minimum necessary patient identifiers will be transferred between organisations for the purposes of linkage, alongside birth weight, multiple birth indicator and maternal age (with no transfer of other fertility, treatment or educational data between the organisations performing linkage).

The study has been funded by a research grant from the Nuffield Foundation. The Nuffield Foundation has no role in the design or conduct of the study, and will not be processing or accessing any data. One of the co-investigators in the study team is based at the University of Oxford and Birkbeck University of London. They have an advisory role only, and will advise upon the statistical analysis and interpretation of the data. They will not have access to individual level data in the ONS SRS. The University of Oxford and Birkbeck University of London are not considered to be Data Controllers or Processors for the study.

The legal basis for the data processing proposed for this study is that of Public Task, as set out in Article 6(1)(e) of the GDPR. Universities are classed as public authorities for the purposes of data protection law. When carrying out research, as is proposed in this agreement, UCL will be carrying out tasks in the public interest in its capacity as a public authority. The legal basis for processing special category data is under Article 9(2)(j) of the GDPR (processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes). Appropriate safeguards will be in place when processing data in accordance with Article 89(1) of the GDPR. All data made available to the UCL research team will be pseudonymised using a unique member number (UMN). The UCL research team will not have access to individual identifiable data at any stage. Individual identifiable data will be accessible only to a limited number of DfE staff responsible for conducting the data linkage, within secure DfE internal systems.

Legal provision for the processing of identifiable data from the HFEA Register for the purposes of medical research is provided under section 33D of the Human Fertilisation and Embryology Act 2008. UCL have received approval from the HFEA’s Register Research Panel for the use of HFEA Register data for this project.

Yielded Benefits:

This is a new Data Sharing Agreement. No data has been disseminated by NHS Digital for this research study. There are therefore no yielded benefits to date.

Expected Benefits:

The HFEA regulates fertility treatment in the UK, with HFEA policy having the potential to influence the large number of fertility treatments performed each year - approximately 54,000 women received ART in the UK in 2018. UCL will provide the HFEA with a report of the study findings via the HFEA Register Research Panel, and anticipate that the results of the study may influence policy decisions by the HFEA. For example, there are concerns at present that some more invasive and expensive forms of ART are being over-used in some treatment centres, with wide variation in practice between centres. Should the results of this study demonstrate potential risks associated with particular types of ART, this may influence HFEA guidance regarding the use of different types of treatment, potentially saving costs for the health service and/ or distress for the recipients of ART. Any change in HFEA guidance is likely to directly influence the clinical practice undertaken by fertility clinics.

The HFEA have also articulated concern about the use of ‘add-on’ treatments by private fertility clinics, and the linkage methodology established by this study is expected to facilitate replication of the study in the future for evaluation of the safety of emerging technologies. Measurement of these potential benefits is intended to be through changes in HFEA guidance and policy regarding fertility treatment, and ongoing analysis of national ART treatment practices in the HFEA annual fertility treatment statistics.

Publication of the research findings in peer reviewed journals and presentation at scientific conferences is anticipated to enable dissemination of the results to a wider medical, educational, and scientific audience. This is intended to enable the results to influence clinical practice in fertility medicine, including beyond the remit of the HFEA such as internationally. It is also expected to help inform education providers as to whether this group of children may benefit from additional educational support, and enable the results to influence future academic research in this field.

UCL’s engagement with the HFEA and the media is anticipated to help disseminate the results of the study to the general public. The study’s findings are expected to be of considerable interest to families who have used or are considering ART. Previous research has shown that many parents who have used ART are concerned whether their children are at higher risk of developing learning or behavioural problems as a result of the fertility treatment. There are biologically plausible reasons for increased vigilance regarding the neurodevelopment of children conceived after ART, because of the potential for ART procedures to influence the nervous system during a vulnerable period of development. This important aspect of the health and development of ART conceived children has not been adequately investigated. As educational performance is a key measure of children’s neurodevelopmental progress once they reach school age, this study is expected to provide families with robust information that addresses these concerns.

Although UCL do not anticipate that the study’s findings would directly influence couples’ decisions about whether or not to use ART, couples using ART treatments experience considerable anxiety and psychological stress both during and after treatment. Should the study show there to be no difference in the educational attainment of ART conceived children compared to naturally conceived children, it is anticipated that this will provide considerable reassurance for families. The resulting reduction in anxiety experienced by parents of ART children has the potential to improve their wellbeing and family functioning. The study has the potential to benefit tens of thousands of families annually across the UK. Measurement of this potential benefit is likely to be through ongoing qualitative research and surveys exploring the experiences of people using ART, such as the HFEA national fertility patient survey.

Should the study show evidence of reduced educational attainment in ART conceived children, this will identify that these children may benefit from additional educational support. This could influence the decisions made by parents and education providers regarding the provision of such extra educational support for their children. It might also stimulate further research to explore the specific domains in which children may experience difficulties and benefit from support. Targeted extra support for children could have the potential benefit of minimising or overcoming any identified attainment gaps, thus improving the educational outcomes of these children. This has the potential to benefit a large number of children annually; over 18,000 children were born following ART in the UK in 2018.

Dissemination of the research findings to families using ART and to education providers is planned to be achieved via publicity through various stakeholder organisations, including Fertility Network UK, the HFEA and the Nuffield Foundation. Measurement of the potential benefits of this dissemination is proposed to be through qualitative research and surveys exploring the experiences of families using ART, as well as through follow-up studies of educational outcomes in children conceived by ART (for example by repetition of the linkage methodology used for this study).

The Confidentiality Advisory Group specifically considered the public interest of the research, and concluded: “The Group discussed the application and agreed that this defined a clear medical purpose which was in the public interest as high-quality research which was essential to understand whether or not this increasing population of children born following assisted reproduction therapy (ART) were at higher risk of developing learning or behavioural problems as they grow. This information would also enable individuals considering ART to receive appropriate and reliable information, and ensure that any problems in children can be identified and managed early.”

Outputs:

Outputs are expected to include:
a) Peer reviewed publications in medical journals. The main results of the study are intended to be submitted for publication in a high impact general medical journal (such as The Lancet, The Journal of the American Medical Association (JAMA), or the British Medical Journal (BMJ)). Publications will be Open Access as per UCL policy, and freely available via both journal websites and UCL webpages.
b) National and international conference presentations. Proposed conferences for presentation include the European Society of Human Reproduction and Embryology (ESHRE) Annual Meeting (2023, date not available yet), and the American Society for Reproductive Medicine (ASRM) Scientific Congress (October 2023).
c) Report for the study funder (Nuffield Foundation), which will be publicly available via the study webpage on the Nuffield Foundation website.
d) Brief lay summary report, which will be made publicly available via the websites of stakeholder organisations (HFEA, Fertility Network UK, Nuffield Foundation, and UCL).

Outputs will contain only aggregate level data with small numbers (<10) suppressed in line with ONS and DfE policy.

Dissemination of the research findings to researchers and scientists will involve presentation at national and international conferences and publication in peer-reviewed medical journals, as detailed above.

Dissemination of the research findings to the public (key stakeholders being parents who have used or are considering ART) will be facilitated through existing collaborations with the HFEA and Fertility Network UK (the leading patient organisation supporting people suffering from infertility). The research project has already been informed by Patient and Public Involvement work, with UCL conducting a survey of parents of children born following ART, which determined that children’s educational potential was one of the most common concerns they had.

Dissemination of the research findings to a lay audience will be in the form of a brief research report and a video summary. Communication of the research findings to the public will be via the websites of stakeholder organisations (HFEA, Fertility Network UK, Nuffield Foundation, and UCL), and via the newsletters and social media channels (e.g. Twitter) of these organisations.

Research regarding fertility treatments, including previous work of the UCL research team, has attracted a high level of media interest, and the team anticipate that this will be the case for the proposed study. The team are acutely aware of the potential harmful effect of inaccurate or sensational reporting of research findings in this sensitive area, and the confusion and anxiety this can cause for couples and parents. The team will work closely with the HFEA, Fertility Network UK & UCL to co-ordinate press releases and ensure that information is conveyed accurately and responsibly.

The research team will commence data analysis as soon as the linked data has been made available. The team would anticipate that the process of data analysis, interpretation and report writing would take approximately 12 months. The team anticipate that the analysis will be completed and outputs generated in late 2023, although this estimate is dependent upon the timeframe for data access approvals being obtained and the data linkage being completed.

Processing:

The study will involve the following data processing and linkage steps:

1. Linkage of the ART Cohort and Sibling Cohort to the NPD:

Individual level data held by NHS Digital under DARS-NIC-180665-GJMW5 regarding the ART and Sibling Cohorts will be securely electronically transferred to the DfE, for linkage to the NPD. Data to be transferred will include individual identifiers required to facilitate linkage (forename, surname, gender, date of birth, postcodes at multiple time points, and unique member number (UMN)), as well as existing variables in a separate pseudonymised bridge file that will be used as covariates in subsequent analysis (birth weight, multiple birth indicator, mother's month and year of birth). The DfE will not attempt to match individuals' birth information to their identifiable data. In order to optimise linkage to the NPD - which records postcode data yearly over the course of individuals’ period of school education - NHS Digital will transfer all available postcodes for individuals in the study cohorts from 2004 onwards (the period for which the NHS demographics dataset is available). Prior to data transfer, NHS Digital will exclude individuals who have submitted a National Data Opt Out.
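The transfer step described above can be sketched as follows. This is a minimal illustration only: the field names and the `prepare_transfer` helper are hypothetical and do not come from the agreement. It shows National Data Opt Out records being dropped before transfer, and the identifiers being kept in a separate file from the birth covariates, with both files keyed on the UMN.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical field names; the real extract layout is not given in the source.
IDENTIFIER_FIELDS = ("umn", "forename", "surname", "gender", "date_of_birth", "postcodes")
COVARIATE_FIELDS = ("umn", "birth_weight", "multiple_birth", "mother_birth_month_year")


def prepare_transfer(
    records: Iterable[Dict], opted_out_umns: Set[str]
) -> Tuple[List[Dict], List[Dict]]:
    """Drop opted-out individuals, then split each remaining record into an
    identifier row (used for linkage) and a bridge row (covariates only)."""
    identifiers: List[Dict] = []
    bridge: List[Dict] = []
    for rec in records:
        if rec["umn"] in opted_out_umns:
            continue  # excluded before any data leaves the sender
        identifiers.append({f: rec[f] for f in IDENTIFIER_FIELDS})
        bridge.append({f: rec[f] for f in COVARIATE_FIELDS})
    return identifiers, bridge
```

Because the bridge file carries no names, dates of birth or postcodes, it can only be joined back to a person through the UMN.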

Substantively employed staff at the DfE will perform the linkage to the NPD on-site in DfE internal systems, with development of the linkage algorithm in collaboration with the UCL research team. The DfE will link the ART children, and their naturally conceived siblings, to their educational outcome data (from the Early Years Foundation Stage Profile through to Key Stage 5), as well as to information regarding secondary outcome measures (special educational needs and school exclusion).

2. Identification of school-matched controls:

Staff at the DfE will identify unrelated (non-sibling) matched controls for the cohort of ART children, using a 1:1 matching algorithm developed alongside UCL. For each level of Key Stage outcome data available for each child in the ART Cohort, one matched control will be identified and randomly selected from the NPD database, with matching by school, age, and gender. A different matched control will be identified for each level of Key Stage outcome data available for each ART child. This is because children can change school during follow-up, and because a single control child may not have outcome data available at all of the Key Stage levels for which the matched ART child has data available. Linking the matched controls back to the ART Cohort file will facilitate a check that the matched controls were not conceived by ART. If any matched controls are found to be ART conceived, they will be replaced with an alternative randomly selected matched control who is not ART conceived.
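A minimal sketch of this matching step is given below, under stated assumptions. The record layout, the `match_controls` function and the use of a seeded random generator are all hypothetical; the actual algorithm developed by the DfE and UCL is not described in the source. For simplicity the sketch excludes ART conceived pupils from the candidate pool up front, rather than matching first and replacing afterwards, and it assumes a control is used at most once per Key Stage.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple


def match_controls(
    art_children: List[Dict], npd_pupils: List[Dict], art_ids: Set[str], seed: int = 0
) -> Dict[Tuple[str, int], Optional[str]]:
    """For each (ART child, Key Stage) row, pick one random non-ART pupil with
    the same Key Stage, school, age and gender. Returns None where no
    candidate exists."""
    rng = random.Random(seed)

    # Group eligible controls by the matching key.
    pools: Dict[Tuple, List[str]] = defaultdict(list)
    for pupil in npd_pupils:
        if pupil["id"] in art_ids:
            continue  # an ART conceived pupil cannot serve as a control
        key = (pupil["key_stage"], pupil["school"], pupil["age"], pupil["gender"])
        pools[key].append(pupil["id"])

    matches: Dict[Tuple[str, int], Optional[str]] = {}
    for child in art_children:
        key = (child["key_stage"], child["school"], child["age"], child["gender"])
        candidates = pools.get(key, [])
        if not candidates:
            matches[(child["id"], child["key_stage"])] = None
            continue
        pick = rng.choice(candidates)
        candidates.remove(pick)  # 1:1 matching: do not reuse within this Key Stage
        matches[(child["id"], child["key_stage"])] = pick
    return matches
```

Each ART child appears once per Key Stage for which they have outcome data, so the same child can receive a different control at each level, as the text requires.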

UCL will request the following outcome and demographic data held in the NPD for children in the study cohorts: test scores for national ‘Key Stage’ pupil assessments at fixed points during school education between 5-18 years of age (Early Years Foundation Stage Profile through to Key Stage 5), special educational needs (SEN) status, school exclusion status, ethnicity, eligibility for free school meals, main language, and Income Deprivation Affecting Children Indices (IDACI) score.

3. Linkage to ONS birth records for school-matched controls:

In order to ascertain information regarding key potential confounding variables (birth weight, multiple birth status, and maternal age) for the unrelated school-matched controls, a minimum number of individual identifiers (names, dates of birth, gender, and postcode) for the unrelated school-matched control group will be securely transferred from the DfE to ONS for linkage to ONS birth records. The data linkage will be performed by ONS staff, within ONS internal systems. N.B. This data linkage step is outside of the scope of the data sharing agreement with NHS Digital, occurring under a separate agreement between UCL, the DfE and ONS.

4. Creation of final dataset for analysis:

Once data linkage processes have been completed, the data will be pseudonymised by DfE and ONS staff using the UMN. The pseudonymised dataset will contain demographic data (gender, age, age within academic year, ethnicity, area-based index of deprivation (IDACI), and eligibility for free school meals) as well as educational outcome data for children in all three groups (ART Cohort, Sibling Cohort, and unrelated school-matched controls). The pseudonymised datasets will be transferred from DfE and ONS internal systems to the ONS-SRS system to allow access by the UCL research team for data analysis. The bridge file of birth covariates (birth weight, multiple birth status, and maternal age) transferred from NHS Digital to the DfE, keyed on UMN, will also be onwardly transferred into the ONS-SRS and not retained by the DfE. The SRS is the ONS facility for providing secure access to sensitive detailed data for accredited Approved Researchers, and conforms to the NHS Data Security and Protection Toolkit. Named Approved Researchers are able to access data related to their project either through a secure online connection to the SRS system from a University computer, or at a physical desk in one of the ONS offices (designated Safe Settings).

Data regarding fertility treatment for the ART cohort is held by UCL in pseudonymised form (using the UMN), under a data sharing agreement with the HFEA. This will be securely transferred from UCL to the ONS-SRS. This data will include the type of ART used and the cause of infertility. The UCL research team will link this data to the demographic and outcome data for the cohort within the ONS-SRS system, using the UMN.

The UCL research team will conduct analysis of the final pseudonymised dataset through the ONS-SRS. At no time will UCL have access to identifiable data. Access to the data will be restricted to a limited number of named UCL research staff working on the project (both substantive employees of UCL and honorary contract holders). Named UCL research staff responsible for conducting the analysis for the project will complete the ONS Researcher Accreditation process, which involves specific training in the safe use of research data environments. They will sign and adhere to the ONS Accredited Researcher Declaration, and will be required to adhere to ONS data protection policies and procedures. This training in data protection and confidentiality through the ONS Researcher Accreditation processes, and the contractual agreements with the ONS to protect the data through the ONS Accredited Researcher Declaration, will apply to both substantive UCL employees and honorary contract holders working on the project.

In the data analysis, performance at each Key Stage level will be compared between the ART group and each of the two control groups (naturally conceived siblings and unrelated matched controls) using linear regression models. Because the data are matched, mixed effects linear regression models will be used. Regression models will also be used to analyze the secondary outcomes (SEN and school exclusion). Because these outcome variables are binary (yes/no) variables, logistic regression models will be used. Analyses will control for birth weight, multiple birth status, birth order, and age within academic year (by month). Analysis involving matched, unrelated controls will additionally control for ethnicity, maternal age at time of birth, postcode-linked social deprivation score (Income Deprivation Affecting Children Indices), and eligibility for free school meals (as a marker of poverty). Subgroup analyses will be performed for children conceived using different types of ART treatment (intracytoplasmic sperm injection vs in-vitro fertilization, fresh vs. cryopreserved cycles) and also for children whose parents have specific causes of infertility (female factor, male factor, both male and female factors, unexplained).

Individual level data will at all times be held within ONS systems, and at no stage will individual level data be transferred to or processed within UCL systems. All data to be transferred out of the SRS (the results of the analyses) will be checked by ONS staff to ensure that no individual level data, or potentially identifiable data, is transferred. Only aggregate level data, with small numbers (<10) suppressed, will be transferred out of the SRS system for publication. When the UCL research team access the pseudonymised dataset in the ONS-SRS they will be performing all analyses within the ONS-SRS system, and will only be extracting aggregated data.
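The small-number rule can be illustrated with a short sketch. The `suppress_small_counts` function and the `"<10"` marker are hypothetical, and the sketch assumes that every count below the threshold, including zero, is suppressed. It does not implement secondary suppression (hiding further cells so that a suppressed value cannot be recovered from totals), which output checking may also require.

```python
from typing import Dict, Union


def suppress_small_counts(
    table: Dict[str, int], threshold: int = 10, marker: str = "<10"
) -> Dict[str, Union[int, str]]:
    """Replace every count below `threshold` with a marker string."""
    return {
        category: (marker if count < threshold else count)
        for category, count in table.items()
    }
```

For example, a table with counts 3, 10 and 250 would come out with the first cell replaced by the marker and the other two unchanged.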

Identifiers that the research team at UCL will have access to (ethnicity, gender, and month & year of birth) are important potential confounders to control for in the analysis. The identification of research participants will not be possible from these variables. UCL does not hold any other information which could allow identification of individuals through interaction with the pseudonymised data that UCL will receive during the study (from NHS Digital, the DfE, and the ONS). There will be no requirement or attempt to reidentify individuals.

5. Data Retention:

Individual identifiable data for the study cohort will be held by the DfE for a period of three months following completion of the data linkage. This will allow for DfE staff to investigate any data anomalies or discrepancies, should they be identified by the UCL research team during data cleaning and analysis. The DfE will securely delete all identifiable and birth covariate data for the study cohort after this three-month period. Only the pseudonymised analysis dataset will be retained after this point, to facilitate analysis by the UCL research team.


NHS Digital reminds all organisations party to this agreement of the need to comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).


Creating synthetic data for health research — DARS-NIC-419453-G3G1G

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 - s261 - 'Other dissemination of information'

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2021-03-29 — 2024-03-28 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Hospital Episode Statistics Admitted Patient Care
  2. Hospital Episode Statistics Admitted Patient Care (HES APC)

Objectives:

The study aims to evaluate methods for creating ‘synthetic’ datasets for health research. The idea is to create artificial datasets that ‘look’ like the original data source (preserving the structure and statistical properties of the data and relationships between variables) but that do not contain information on any real individuals, and therefore pose no confidentiality risks. If such datasets can be created, synthetic data could be used by researchers to understand the structure of the data, develop data cleaning protocols, codes and algorithms, and test out methods. Final analyses could then be conducted once approvals are in place (in secure settings), or alternatively, by the data providers themselves (so that researchers would not need any access to confidential information).

The concept of synthetic data is not new, but its use is increasing. For example, synthetic versions of general practice data (from the Clinical Practice Research Datalink) have recently been generated, including for COVID-19 research (https://www.cprd.com/content/synthetic-data). Health Data Research UK have also recently prioritised work in this area (for more information, see https://www.hdruk.ac.uk/synthetic-data/).

To achieve this aim, the research team at University College London (UCL) will compare a range of methods for generating synthetic versions of Hospital Episode Statistics (HES), in particular Admitted Patient Care (APC).

Legal basis
The legal basis for processing personal data for this purpose at UCL falls under Article 6(1)(e) of the General Data Protection Regulation (GDPR), i.e. “a task carried out in the public interest”. It also falls under Article 9(2)(j), “processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes”.

There is clear public interest for this application, as it could lead to a significant streamlining of research using electronic health data. The potential for using electronic health data for timely health research has recently been highlighted by COVID-19. Strict governance restrictions to protect confidentiality mean that data, when released, are highly anonymised and/or only accessed in secure research settings. This is for good reason - there are some who argue that individual-level data can never be truly anonymous. However, methods to anonymise data, such as by removing exact event dates or categorising variables, can mean that the resulting data are not sufficiently granular for research purposes. Synthetic datasets could provide an alternative resource for health researchers.

Findings from the study will help data providers decide whether providing synthetic versions of electronic health data could help address the increasing pressure to deliver timely outputs in the context of increasing numbers (and complexity) of data access applications. Sensitive or potentially identifiable datasets such as Hospital Episode Statistics have great potential for economic and social impact, leading to better informed policy decisions and effective public services. Widening the use of these data through synthetic data would therefore lead to increased efficiency in health research, ultimately benefiting the public.

The researchers will access pseudonymised data only (as only pseudonymised data has been released under NIC-393510).

How the data requested will achieve the aim
The HES data requested will achieve the aim of this study by providing an ‘original’ data source that can be used as a basis for evaluating a range of synthetic data generators. For example, the UCL team will select a core set of variables to be synthesized including patient characteristics (index of multiple deprivation score (IMD), ethnic group, sex), birth outcomes, number of admissions, and high-dimensional fields such as diagnosis fields.

Exploring whether it is possible to accurately generate ‘lookalike’ variables in a synthetic dataset will help inform researchers and data providers on the value of synthetic data. Information from HES will also allow the researchers to determine for which types of data the methods are effective. For example, they will be able to determine whether the methods can be used to generate synthetic versions of early HES data (from 1997, likely to be less complete) and more recent data (from 2019, more complete).

The data will be used to evaluate three synthetic data generators: Synthpop, Simulacrum and Jomo. Synthpop and Jomo are implemented in open source software (R packages). Simulacrum was developed by Health Data Insight (a social enterprise overseen by the Office of the Regulator of Community Interest Companies). UCL will evaluate these generators, in terms of how well they can create synthetic datasets, by the following:
- Assess general utility by visualising marginal distributions of key variables and by estimating the standardised propensity mean square error (pMSE). The pMSE is a measure of poor discrimination between the original and the synthetic data (a positive feature in this context) and is derived from a logistic regression model for the propensity of a record to be from the original dataset. Coefficients of the propensity model will be inspected to identify ways in which the synthetic and original data diverge.
- Assess specific utility by estimating coefficients from selected models of interest using the synthetic and the real data, and then deriving standardised differences and percentage bias for the coefficients of interest. UCL will assess the extent to which inferences based on the synthetic data are robust, by estimating the overlap of confidence intervals for coefficients derived from the original and synthetic datasets using the interval overlap measure. Results will be averaged across multiple versions of the synthetic data.
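The two utility measures named above can be sketched as follows. The agreement does not give formulas, so this uses the definitions commonly cited in the synthetic data literature, which is an assumption: the raw pMSE is the mean squared difference between each record's fitted propensity and the share of original records in the stacked dataset, and the interval overlap is the average proportion of each confidence interval covered by their intersection. The function names are hypothetical, the propensity scores are assumed to come from a model fitted elsewhere, and the standardisation of the pMSE against its null expectation is not shown.

```python
from typing import Sequence, Tuple


def pmse(propensities: Sequence[float], share_original: float) -> float:
    """Raw propensity mean squared error.

    `propensities` holds the fitted probability that each record in the
    stacked (original + synthetic) data is an original record.
    `share_original` is the proportion of original records in that stack.
    A value near zero means the model cannot tell the two sources apart.
    """
    n = len(propensities)
    return sum((p - share_original) ** 2 for p in propensities) / n


def interval_overlap(
    ci_original: Tuple[float, float], ci_synthetic: Tuple[float, float]
) -> float:
    """Average share of each confidence interval covered by their intersection.

    Returns 1.0 for identical intervals and a negative value when the
    intervals do not overlap at all.
    """
    lo_o, hi_o = ci_original
    lo_s, hi_s = ci_synthetic
    overlap = min(hi_o, hi_s) - max(lo_o, lo_s)
    return 0.5 * (overlap / (hi_o - lo_o) + overlap / (hi_s - lo_s))
```

Averaging across multiple synthetic datasets, as the text describes, would then amount to taking the mean of these values over the versions generated.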

Relevant background to the request
This is a methodological research study funded by the Economic and Social Research Council (ESRC).

Relationship between proposed project and associated work
UCL are proposing to re-use an existing extract of HES APC data (held by members of the wider research team under a separate DSA; NIC-393510). UCL are requesting a re-purposing of this DSA to allow them to access specifically the years 1997/98 and 2019/20, which will enable them to establish whether methods can be used to generate synthetic versions of both early HES APC data and more recent, more complete data.

The purpose of the request
The research team are requesting that health data captured in HES are used to evaluate whether it is possible to generate synthetic versions of health data that can be used for health research. The purpose of this request is to answer a set of research questions about the feasibility and usability of synthetic data, aiming to generate evidence on the usefulness of synthetic data for data providers and health researchers.

For this purpose, the research team are requesting pseudonymised HES APC data for 1997/98 and 2019/20. National data are required in order to capture the variation in data quality across different providers and to evaluate whether synthetic data methods can handle large amounts of data accurately. There are no alternative ways of achieving the purpose of this application. The research team will use the minimum data required in order to answer the research questions.

Organisations involved
Data controller: UCL
Data processor: UCL

Although the study involves a co-applicant at the London School of Hygiene and Tropical Medicine (LSHTM), they will only contribute advice (particularly on the use of the Jomo package). They will not access any HES data. LSHTM is not considered as a Data Controller or Data Processor.

There are no funders or commissioners directly involved in the project.

No party involved in the application will receive any form of commercial benefit from the use of data.

Expected Benefits:

The research will benefit the provision of health care and the promotion of health, by informing policy on whether synthetic versions of electronic healthcare datasets can be shared with researchers, in order to streamline the research process. This will have direct relevance to all health research using healthcare datasets such as HES. The research is in the public interest, because the public have vocalised opinions about the need for timely access to high quality healthcare data, especially in light of COVID-19. The results of this study will provide evidence on whether synthetic data can be used to speed up the data access applications, data management, and data analysis stages of research. The study will directly benefit the Health and Social Care sector by providing data providers, researchers, governance bodies and policy makers with detailed and up-to-date evidence to aid decision making about the use of synthetic data to support a wide range of health research. Our guidelines on the appropriate use of synthetic data will help facilitate timely access to administrative datasets, improve the efficiency of research and streamline approval processes in the context of increasing demands on data providers. Our work will help minimise access to identifiable personal data, by allowing researchers to develop methods using anonymised, synthetic data, with final models being implemented by data providers or within secure settings.

The study team will engage with data providers, researchers and the public throughout the study. UCL will offer workshops on synthetic data and the results of our study with NHS Digital, DfE and ONS. This will allow the study team and data providers to establish views on the resource implications associated with synthetic datasets, and the likely efficiency benefits of being able to provide synthetic datasets to researchers.

Outputs:

The main output will consist of a set of guidelines and evidence on the use of synthetic data, including a comparison of approaches. These guidelines will be developed alongside data providers and other researchers as part of an engagement phase of the study. UCL will disseminate the guidelines using existing networks, e.g. with colleagues at the Office for National Statistics (ONS), the Department for Education (DfE), NHS Digital and Public Health England (PHE), as well as Administrative Data Research UK (ADR UK) and Health Data Research UK. Findings will be used by data providers to inform ongoing research into the use of synthetic data. Findings will also be published as peer-reviewed publications in high quality journals (e.g. PLoS One, submitting within 3 years of data access). The researchers will also work with members of the public to co-produce a range of outputs suitable for communicating results to members of the public interested in the use of electronic health data for research, e.g. through training events or fact sheets.

All journal articles will be published with open access, to ensure the wide dissemination of the study’s results to data providers, healthcare professionals, governance bodies, and other researchers. Results of the study will also be made available in both clinical and methodological research forums: abstracts will be submitted to the following conferences within 2 years of data access: International Population Data Linkage Network, Public Health Science.

As the main output will be guidelines on the use of synthetic data, and will not inform any decisions about individuals, the researchers do not expect that an EQIA will be required. However, the guidelines will include an assessment of how well synthetic data can preserve information in the real data about people with protected characteristics (e.g., to ensure that ethnic groups are represented in the same way within the synthetic and real data).

Data will not be used for sales or marketing purposes.

Processing:

The research team at UCL will extract a core set of variables including patient characteristics (IMD, ethnic group, sex), birth outcomes, number of admissions and high dimensional fields such as diagnosis fields, from an existing HES APC extract held under a separate Data Sharing Agreement (NIC-393510). The intention is to use this original data set and replace all of the values with synthetic ones, causing minimal distortion of the statistical information contained in the original data set. This would result in a new dataset, in which every value for every variable will be synthetic, i.e. the synthetic data will not contain any records that correspond to a real person. The new extract will be transferred to a new location in the UCL Data Safe Haven. All analysis will take place within the UCL Data Safe Haven.

The data flows are summarised as follows:
1. HES APC data with no identifiers will be extracted for 1997/98 and 2019/20 from NIC-393510-J1Q6T
2. The new extract will be transferred to a new location on the UCL Data Safe Haven.
3. All analysis will take place in the UCL Data Safe Haven.

There will be no attempts to identify individuals. Risk of re-identification will be mitigated by checking all outputs for small cell sizes. No potentially disclosive outputs will be shared or published. Data processing will only be carried out by substantive employees of UCL who have been appropriately trained in data protection and confidentiality.


Centre for Longitudinal Studies - Millennium Cohort Study (MCS)- (Age 17 consent) — DARS-NIC-384504-N2V5B

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant, No, Identifiable (Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 – s261(2)(c), Health and Social Care Act 2012 - s261 - 'Other dissemination of information'

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2021-01-14 — 2022-01-13 2021.09 — 2024.05. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: Yes

Datasets:

  1. Emergency Care Data Set (ECDS)
  2. Hospital Episode Statistics Accident and Emergency
  3. Hospital Episode Statistics Admitted Patient Care
  4. Hospital Episode Statistics Critical Care
  5. Hospital Episode Statistics Outpatients
  6. Hospital Episode Statistics Accident and Emergency (HES A and E)
  7. Hospital Episode Statistics Admitted Patient Care (HES APC)
  8. Hospital Episode Statistics Critical Care (HES Critical Care)
  9. Hospital Episode Statistics Outpatients (HES OP)

Objectives:

The Centre for Longitudinal Studies (CLS) at University College London (UCL) is an academic resource centre responsible for producing and disseminating data resources for the scientific community. It is responsible for four of Britain's internationally renowned longitudinal cohort studies: the 1958 National Child Development Study, the 1970 British Cohort Study, Next Steps and the Millennium Cohort Study (MCS). All these studies are following the groups of participants from cradle to grave. As such, this group of studies is unique and has provided, and is still providing, a wealth of information used in the policy decisions affecting society's health and well-being. The purpose of this application covers two aspects:

a) Request linkage of Hospital Episodes Statistics (HES) and Emergency Care Data Set (ECDS) data to a subset of the MCS (only cohort members who consented to have their health records linked to their survey data)
b) CLS seeks permission to sub-license this linked data to the research community via the UKDS.

MCS is renowned worldwide for the evidence it provides on children’s experience of growing up in the United Kingdom in the 21st Century. Since the study’s launch there have been seven attempts to re-contact and gather information from the whole cohort (at ages 9 months, 3 years, 5 years, 7 years, 11 years, 14 years and 17 years). The MCS covers such diverse topics as parenting; childcare; schooling and education (e.g. academic qualifications, vocational qualifications); daily activities and behaviour; cognitive development; child and parent mental and physical health; employment and education; income and poverty; housing, neighbourhood and residential mobility; and social capital, ethnicity and identity. The information collected in previous sweeps of the study has made MCS a high quality data resource for scientific investigation across the life course and domains.

The seventh, Age 17 survey (2018-19) added to the data already collected in previous sweeps by updating information on current circumstances of the cohort and experiences they have had since the last sweep. In previous sweeps, schooling will have been the main activity common to the vast majority of cohort members. The Age 17 survey marked an important transitional time in the cohort members’ lives, where educational and occupational paths can diverge significantly. It is also an important age in data collection terms since it may be the last sweep at which parents are interviewed and it is an age when direct engagement with the cohort members themselves rather than their families is crucial to the long term viability of the study. To reflect this, CLS conducted face to face interviews with the cohort members for the first time. Cohort members were also asked to do a range of other activities including filling in a self-completion questionnaire on the interviewer’s tablet, completing a cognitive assessment (number activity) and having their height, weight and body fat measurements taken.

This was a unique opportunity to measure factors that underlie different types of transition into adult life, which may affect future wellbeing in unprecedented ways. Capturing these transitions well, alongside the contemporary factors underlying them was critical. It was important to build up a picture of daily life, including factors such as: relationship with parents, family and peers, risky behaviours, social media engagement and efforts on activities such as education /school. Additional factors affecting decisions at this age include attitudes and preferences, such as preferences for education, attitudes to risk, willingness to trade off resources at different points in time, and expectations about future life events. Measuring social and emotional development, mental health and cognitive development and using well-validated instruments, was also a critical component of the survey.

During this survey, CLS also obtained informed consent from cohort members for their health data to be linked to the data collected in the study. In total, consent to health linkage was obtained from approximately 6118 cohort members in England. These are the cases which CLS is seeking permission to link to HES and ECDS data. More information about this survey can be found here: https://cls.ucl.ac.uk/wp-content/uploads/2020/09/MCS7-user-guide-Age-17-ed1.pdf

Linking health data from HES and ECDS to the MCS survey data will greatly increase the possibilities for using the cohort to study how health outcomes impact on the individual aspects of their life such as education, work, relationships and family life and, likewise, how health outcomes relate to the individual behaviours and lifestyles aspects such as drug and alcohol use, sexual health, diet and exercise, which are all documented as part of the study. The successful inclusion of HES and ECDS data will enrich these data by revealing which cohort members have been admitted to or attended hospital and the reasons for this, e.g. drug and alcohol treatment, accident and emergency, maternity and mental health services which could help CLS better understand how health conditions could be better treated or supported. This kind of analysis necessitates pseudonymised record level data.

Data about health behaviours may be more accurate if obtained from administrative records, because survey responses can be affected by misreporting of complex health conditions, under-reporting of particular health problems, or perceived sensitivities around certain behaviours and lifestyle aspects. There are no alternative, less intrusive ways of obtaining such information. This also offers an interesting methodological opportunity to validate the data collected in the survey and vice versa.

CLS at UCL is requesting data from 2001 (where available) to most recent data available. The first data collection of the MCS study happened in 2001 when cohort members were 9 months old. CLS therefore wants to access the historical information for its cohort members. Health events can be experienced over an extended period. The objective HES/ ECDS data will complement and enhance the existing survey data, also improving the accuracy of the data collected in the survey. The large data range will facilitate research that CLS anticipate will be carried out on the effects of familial socioeconomic circumstances, lifestyle and environmental factors on the evolution of the wellbeing, health and development of cohort members, offering huge potential for scientific and policy-related research. This will build on the extensive body of work focused on the millennium cohort as given in ‘Yielded Benefits’.

The MCS study follows the lives of young people across England, Scotland, Wales and Northern Ireland. As this project requests data linkage to the MCS study, the geographical spread of the data requested will be across England.

CLS at UCL have considered data minimisation in terms of what CLS needs but further minimisation is not possible. The data requested in this application will be part of a database, created to serve various research projects. Data minimisation will be applied when CLS sub-licences the data to third parties. Third party organisations wishing to access the data will need to specify the variables needed and will only be given access to a sub-set of the data which is needed to conduct their research.

The overall aim of the linkage is to:

1. Validate, enhance and improve the quality of the cohort data, in this way creating a uniquely rich administrative/survey linked data set (HES/ECDS - MCS).
2. Use this data set to produce methodological papers on the quality of the data (eg around measurement, representativeness) and research papers helping to showcase its benefits for health and social care.
3. Promote and make possible wider use of this linked data set, through providing wider access to the linked NHS Digital HES / CLS MCS data to the research community via the UK Data Service (UKDS) Secure Lab, through a sub-licensing agreement agreed between CLS and NHS Digital.

THE SUB-LICENCE:
Sub-licensing of the data will be in line with the DARS sub-licensing data standard: https://digital.nhs.uk/services/data-access-request-service-dars/dars-guidance/sub-licencing-and-onward-sharing-of-data

UCL seeks permission to include onward sharing of the linked HES/ ECDS and CLS MCS data with the UK Data Service (UKDS), where data can be accessed by accredited researchers in a Secure Research Environment, known as Secure Lab, following a "Sub-licensing model".

The UKDS is funded by the Economic and Social Research Council (ESRC) with contributions from the University of Essex, the University of Manchester and Jisc (Jisc is a United Kingdom not-for-profit company whose role is to support post-16 and higher education, and research, by providing relevant and useful advice, digital resources and network and technology services, while researching and developing new technologies and ways of working). The UKDS provides access to high quality data to meet the data needs of researchers, students and teachers from all sectors including academia and central and local government.

The UKDS is based at, and hosted by, the University of Essex. The University of Essex are therefore listed as a data processor and also listed in the data processing and storage location sections. Only staff who are permitted to work at the UKDS (and are substantively employed by University of Essex) will process the data. Should any substantively employed researchers from University of Essex wish to use the UKDS data, they will be required to apply via the sub-licence route, the same as other researchers from other organisations.

Under the "Sub-licensing model", NHS Digital shares data with UCL who are in turn licensed to share these data with other organisations, subject to agreed controls, scoped in this agreement between NHS Digital and UCL. In line with this onward sharing model, the data sharing controls in place between NHS Digital and UCL are replicated between UCL and the other organisations. UCL, which houses CLS at the UCL Institute of Education, is fully accountable for the actions of the parties involved in subsequent data share and use. The agreement mirrors the Data Sharing Framework Contract in place between NHS Digital and UCL. It also requests information about the research proposal, benefits to health and/or social care, organisational security assurance and terms and conditions regarding onward sharing of data, responsibilities and processing activities etc.

Under the sub-licensing model, CLS will deposit the linked data with UKDS who will serve as a data repository. Access to the deposited data will be granted to approved researchers within a Secure Research Environment on behalf of UCL as outlined below. NHS Digital will retain the ability to directly audit UKDS's compliance with the outlined and agreed data access arrangements.

The anticipated volume / number of licences is 1-2 sub-licences per month, and the potential length of the sub-licences is 2-3 years in length.

There will be no charge applied to licenses supplied by UCL.

The territory of use in the sub-licence will be the same or narrower than the territory of use stated in this data sharing agreement, namely the UK.

In this sharing model of the linked data, UCL will be a data controller, determining the purposes for which and the manner in which the linked data are processed. UKDS will be a data processor, as they will be processing the data on behalf of UCL. This includes holding the linked data in a secure environment, screening for completeness of applications for data access, providing training for use of linked data securely, entering into contractual agreements with approved researchers, extraction of approved data and setting up access systems, and approving statistical outputs, following a statistical disclosure control procedure.

The approved organisations and researchers who are granted access to the linked data via the UKDS Secure Lab agree to terms and conditions of use, their rights and responsibilities as users of the linked data, as defined by the UKDS. In addition to the agreements signed with the UKDS, the organisation of the researcher applying to use the linked data will enter into a Licence agreement with UCL.

ORGANISATIONAL AGREEMENTS
UCL will provide a sub-licence to UK organisations undertaking research that will be of benefit to the public (this will be assessed in the project proposal form submitted to the UKDS and to UCL). Applicants (potential licensees) will need to show that the provision of the sub-licence will be in the public interest and that the data will be used either (i) for the provision of health care or adult social care; or (ii) for the promotion of health. UCL will not provide data access to commercial organisations for research for commercial purposes. Additionally, the UCL Licence agreement will assess the project proposal against its assessment criteria to determine the details of the project, the people who will be accessing the data, and what data will be requested. Applicants will need to be accredited researchers or agree to undertake training and become accredited, prior to accessing the data. Additionally, an applicant's organisation will need to provide evidence that they have information governance and security assurances in place. Members of the CLS Data Access Committee (DAC) will review and decide if the evidence provided satisfies the criteria requirements.

Applicants (licensees) and their organisations will have to sign two agreements to obtain a sub-licence, one with the UKDS and another with UCL. In both cases the licensee will agree with the terms stated in the Confidentiality Section of the UCL Licence Agreement and with the Confidentiality Terms stated in the Secure Access Agreement which will be signed with the UKDS. By signing these agreements, the licensee agrees to adhere to these terms, including respecting the privacy of health services user data they will receive. Licensees are also reminded of the penalties they are likely to incur if they do not comply with the terms they have agreed. In addition to the above, the UKDS agreement stipulates that data users must complete mandatory training before they are allowed to access the data.

To ensure the security of the linked information, shared with UCL by NHS Digital and subsequently shared by CLS at UCL with the UKDS, where data could be accessed by approved researchers in a Secure Lab, UCL envisage the following controls employed at the different steps of the process of depositing, approving and sharing of the linked information:

-An agreement between NHS Digital and UCL to onwardly share linked HES/ ECDS and CLS MCS information under the "Sub-licencing model" which outlines the terms and conditions of use of the linked data via the UKDS Service Secure Lab as the data repository, and the full accountability of UCL (housing CLS at the UCL Institute of Education) for the actions of the parties involved in subsequent access to the linked data.

-An agreement between UCL (as a data controller) and UKDS at The University of Essex (as a data processor), which outlines the terms and conditions under which the linked data can be accessed via the UKDS Secure Lab.

-An agreement between UKDS and the approved researcher, which outlines the terms and conditions of use of the linked data in the UKDS's Secure Lab.

-An Agreement between CLS at UCL and the organisation requesting to use the linked data via the UKDS, which outlines the terms and conditions of use of the linked data.

The researcher accessing the data via the UKDS Secure Lab will not be able to download any data. Once the researcher has finished their research, the UKDS will destroy the data folder with the tailored dataset for the specific project.

The data held at UCL will be destroyed if the data sharing agreement between NHS Digital and UCL ceases. In that case, the license agreement between UCL and the licensee organisation will also be terminated.

UCL legal basis for processing (acquiring, linking and sharing) personal data is for a public task under GDPR (article 6(1)(e)) i.e. processing is necessary for the performance of a task carried out in the public interest (as is made explicit to participants in the information leaflets provided). UCL also process special categories of personal data for research under GDPR (article 9(2)(j)) i.e. processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes. In addition, for ethical reasons and under the Common Law Duty of Confidentiality, UCL sought permission from cohort members to access and link their routine health records to their survey data, and to the onward sharing of this linked data in pseudonymised form (via a secure setting with appropriate safeguards).

This data dissemination includes the following safeguards:
i. Linkages are based on informed consent
ii. Identifying variables are held separately to the survey responses, including during the matching process
iii. Data transfers are made securely e.g. encrypted
iv. The data will only be used for statistical research purposes and will not involve any decision making affecting a person
v. Data are stored in secure environments certified to ISO 27001
vi. Data are only accessed in pseudonymised form and treated for disclosure if necessary
vii. Data are accessed in a secure environment.
viii. Disclosure control checks are carried out before any research publication.

All data processed under the sub-license will be completed using the same legal basis as mentioned above, namely GDPR (article 6(1)(e)) and GDPR (article 9(2)(j)). The UCL Licence agreement will require licensees to provide the Legal Basis of their request to link health data via CLS and therefore CLS Data Access Committee (DAC) will only grant approval to applications from researchers within public bodies who have a legal basis to process data under GDPR. The data disseminated to UCL will be accessed by substantive employees of UCL who will work on the data to make it research ready and deposit it at the UKDS for researchers applying to use it for specific projects.

Yielded Benefits:

Below are examples of existing publications using the MCS data benefiting public health.

Drinking in pregnancy
In the age 3 survey, MCS cohort children completed activities to show which words they understood and spoke, and which colours, letters, numbers, shapes and objects they were familiar with. Parents were also asked about different aspects of children's behaviour, such as how well they got on with other children and how active they were. Research using MCS survey data has found that children whose mothers drank heavily while they were pregnant were more likely to have behaviour problems at age 3 than those whose mothers didn’t drink or drank lightly. On average they also did less well in the different activities, although lots of other factors are important too.

Smoking in pregnancy
Several studies based on MCS have looked at how smoking during pregnancy relates to children’s development. One group of researchers found that babies with mothers who smoked at any point while they were pregnant weighed on average 146 grams less when they were born (around the weight of a smartphone) than babies with mums who did not smoke. Overall, the more cigarettes a mother smoked a day, the less her baby weighed at birth. Babies with mothers whose partners smoked around them while they were pregnant also weighed on average 36 grams less (about the weight of a chocolate bar) than those with mothers who were not exposed to smoke.

Breastfeeding and child health
An influential study found that babies who were breastfed in the first months of their lives were less likely to go to hospital for diarrhoea or respiratory problems, such as infections and pneumonia. The researchers estimated that half of hospital stays for diarrhoea, and a quarter of stays for respiratory problems, could be prevented every month if all babies in the UK were fed entirely on breast milk for at least six months.

Breastfeeding and child development
Between ages 3 and 7 MCS children took part in a range of activities to show which words they knew and the patterns they could identify in shapes and images. Studies have found that children who were breastfed tended to do better in these exercises and to have fewer behaviour problems. Research has also suggested that there is a relationship between breastfeeding and young children’s ability to coordinate the movements of their arms and legs and to reach milestones such as standing up for the first time and taking their first steps.

Expected Benefits:

MCS surveys include questions relating to health. CLS at UCL will compare these responses with the HES/ ECDS data available to them to obtain a better understanding of the relationship between self-reported and administrative data. This will be shared via methodological information which will assess the data quality and comparability of two important data sources. This will benefit research looking at Health and Social Care.

This data linkage will facilitate research that CLS anticipate will be carried out on the effects of familial socioeconomic circumstances, lifestyle and environmental factors on the evolution of the wellbeing, health and development of cohort members. This could be of direct benefit to the NHS and to community services interfacing with schools through informing policy to improve healthy lifestyles.

The creation of this linked MCS- HES/ ECDS database will be a rich data resource with great potential for research on health and for informing policy interventions on health and social care.

Outputs:

Following the data quality and validation work, the first output will be the creation of the linked MCS- HES/ ECDS dataset. The HES/ ECDS data will add an important layer to this already rich data as well as providing the means for data quality checking.

The second output will be health-related research, alongside methodological papers on the linked dataset, published in peer-reviewed journals. The methodological assessments are expected to finish three years after obtaining the data. Outputs will contain only aggregate level data with small numbers suppressed in line with the HES analysis guide. CLS researchers doing research and/or methodological work will access this data via the UCL Data Safe Haven.
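As a rough illustration of small-number suppression in aggregate outputs, the following Python sketch blanks out low counts before a table is released. The threshold value and the choice to leave zero counts visible are assumptions made for this example only; the actual rules are set by the HES analysis guide, which is not reproduced here. Secondary suppression (preventing recovery of a hidden cell from totals) is not covered.

```python
from typing import Dict, Optional

# Hypothetical threshold, for illustration only. The real value and rules
# come from the HES analysis guide.
SUPPRESSION_THRESHOLD = 8


def suppress_small_counts(
    counts: Dict[str, int], threshold: int = SUPPRESSION_THRESHOLD
) -> Dict[str, Optional[int]]:
    """Return a copy of `counts` with non-zero cells below `threshold` set to None."""
    return {
        label: (None if 0 < n < threshold else n)
        for label, n in counts.items()
    }


# Made-up aggregate table, not real data.
admissions = {"asthma": 120, "fracture": 3, "appendicitis": 0}
print(suppress_small_counts(admissions))
```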

The creation of this MCS- HES/ ECDS database and the research and methodological papers are the first steps in establishing a robust research database which will be of benefit to health and social care. This data linkage opens new research opportunities by combining reliable administrative data with detailed survey data. This linkage increases the number of variables available for research in the dataset and complements the health information provided by the participants in the survey. Combined, these data sources enhance each other, making it possible to capture detailed information regarding an individual’s health and wellbeing. Health events can be experienced over an extended period; tracking all relevant events over such a long period may not be feasible in a single database. Using HES/ ECDS data which records such information can improve the accuracy of the data collected in the survey, offering huge potential for scientific and policy-related research.

CLS at UCL actively promotes the use of their data among the research community through publications and events (e.g. training and workshops on each data set to help researchers better use the data), and provides extensive documentation and guidance on the use of the data, which ultimately benefits health and social care.

Processing:

1. CLS at UCL will supply NHS Digital with identifiers of cohort members who have consented to this data linkage, including full name, sex, postcode, date of birth, NHS number (if known) and study ID (study-specific pseudonymised identifier).

2. NHS Digital will link the identifiable study data to HES and ECDS data. NHS Digital will then remove identifiers from the linked datasets and return the pseudonymised datasets to the CLS team at UCL with the study ID. The data disseminated to UCL will be accessed by substantive employees of UCL who have been appropriately trained in data protection and confidentiality. The data will be held at the secure server in the UCL Data Safe Haven (DSH) and accessed remotely by CLS staff.

3. CLS will carry out validation of the administrative pseudonymised data received (linked HES/ ECDS data) and will combine the supplied administrative data with the information collected from the participant as part of the MCS study using the study ID.

Once the linked survey-administrative data files have been created, CLS may perform other activities to prepare the data for use such as coding and cleaning, derivation of summary variables and compilation of data documentation.

4. CLS researchers will use these data to create an analysis file which will not contain any identifiable data.

5. CLS will create derived variables that summarise study members' hospitalisation and health histories (e.g. hospital admissions and re-admissions, incidence of common diseases, children's ailments etc.), and will compare MCS survey data with data from hospital statistics, in order to compare and validate the data collected in CLS surveys.

CLS researchers who need to access the data to produce methodological papers on the quality of the data (eg around measurement, representativeness) and research papers helping to showcase its benefits for health and social care will need to submit an application to CLS DAC detailing their project proposal. Upon DAC approval, a pseudonymised dataset will be provided to the researcher. The data will be held at the secure server in the UCL Data Safe Haven.

Identifiers will be held separately from attribute characteristics. HES/ ECDS data will not be relinked to the identifiable data, which is held separately from the survey responses. Re-identification will only happen when a cohort member requests withdrawal from the study, which includes removal of their data. Where a participant wishes to withdraw from the study, the identifiable data is used to locate the study ID, which is then used to destroy their data.
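The separation of identifiers from attribute data, with the study ID as the only linking key, can be sketched in Python as follows. All field names, IDs and values are hypothetical and invented for illustration; this is not the CLS or NHS Digital schema or their actual linkage code.

```python
# Hypothetical records. Identifiers are held in their own table, apart from
# survey answers and from the pseudonymised HES rows.
identifiers = [
    {"study_id": "M001", "nhs_number": "9999999999"},
    {"study_id": "M002", "nhs_number": "8888888888"},
]
survey = {"M001": {"age17_height_cm": 172}, "M002": {"age17_height_cm": 165}}
hes_pseudonymised = [  # as returned once direct identifiers are removed
    {"study_id": "M001", "admission_year": 2015},
    {"study_id": "M001", "admission_year": 2018},
]


def link_on_study_id(survey, hes_rows):
    """Attach each member's HES rows to their survey record using study ID only."""
    linked = {sid: {**answers, "hes_episodes": []} for sid, answers in survey.items()}
    for row in hes_rows:
        sid = row["study_id"]
        if sid in linked:
            episode = {k: v for k, v in row.items() if k != "study_id"}
            linked[sid]["hes_episodes"].append(episode)
    return linked


def withdraw(nhs_number, identifiers, linked):
    """Look up the study ID in the separate identifier table, then delete that member's data."""
    for rec in identifiers:
        if rec["nhs_number"] == nhs_number:
            linked.pop(rec["study_id"], None)


linked = link_on_study_id(survey, hes_pseudonymised)
withdraw("9999999999", identifiers, linked)
```

In this sketch the analysis file (`linked`) never contains the NHS number; the identifier table is consulted only in the withdrawal path.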

UCL DATA SAFE HAVEN

The UCL DSH is certified to ISO 27001:2013 and is compliant with NHS Digital’s Data Security and Protection Toolkit. Research teams using the DSH complete annual training and regularly review data access arrangements, ensuring access to data is limited to those authorised to have it. UCL Computing Regulations are based on the premise that access to resources is generally forbidden unless expressly permitted. All data transfers from the DSH require approval and are carried out through secure portals which are fully audited. Access to the UCL DSH is via remote desktop and requires multi-factor authentication. In addition to a strong password, each user has to use a six-digit number generated by a smartphone app or physical token at each login. Passwords must be changed at regular intervals, and unused accounts are automatically disabled after a fixed period. Once inside the environment, robust access control ensures that researchers can only examine information that they are approved to use.

UKDS DATA ACCESS MECHANISMS:

As an ESRC resource centre, CLS at UCL shares its survey data with the research community via the UKDS under safeguarded or controlled access mechanisms, dependent on the likelihood and potential impact of disclosure. Data with higher risk of disclosure is treated with an appropriate degree of security and management. CLS data fall into the following categories which are defined by the likelihood and potential impact of disclosure:

-Tier 1: data with low level of disclosure: e.g. participant self-reported survey data. These data are made available through the UKDS End User Licence and have a low impact of disclosure;
-Tier 2a: data that is potentially disclosive: e.g. medium level and coarse geographies or sensitive information about cohort members. These data are made available through the UKDS Special Licence and have a medium impact of disclosure;
-Tier 2: data that are too detailed, sensitive or confidential to be made available under the standard End User Licence or Special Licence, such as detailed geographical indicators or fine-grained individual level linked data. These data have a high impact of disclosure and are made available through the UKDS Secure Access.

ACCESS MECHANISMS TO NHS DIGITAL HES/ ECDS DATA LINKED TO CLS COHORT STUDIES VIA THE UKDS:

The HES/ ECDS data provided to CLS at UCL by NHS Digital, which are linked to the CLS cohort members, have been processed by the CLS data management team to minimise the risk of disclosure when linked to the CLS survey data. This has been achieved by removing highly identifiable variables and altering other variables by top-coding or truncating them. Following this processing, the final health datasets have been classified under Tier 2.
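Top-coding and truncation, the two treatments named above, can be illustrated with a short Python sketch. The variable names, the cap of 90 days and the three-character cut are hypothetical choices for this example; the source does not state which variables CLS treated or what cut-offs were used.

```python
def top_code(value: int, cap: int) -> int:
    """Replace any value above `cap` with `cap` itself."""
    return min(value, cap)


def truncate_code(code: str, keep: int) -> str:
    """Keep only the first `keep` characters of a coded value."""
    return code[:keep]


# Hypothetical record and cut-offs, chosen only for illustration.
record = {"length_of_stay_days": 212, "diagnosis_code": "J4590"}
treated = {
    "length_of_stay_days": top_code(record["length_of_stay_days"], cap=90),
    "diagnosis_code": truncate_code(record["diagnosis_code"], keep=3),
}
print(treated)
```

Both treatments reduce how unusual a single record looks: an extreme value is folded into a top category, and a detailed code is coarsened to a broader group.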

It is UCL's intention to deposit these Tier 2 linked HES/ ECDS data with UKDS under the UKDS Secure Access, and provide access to this information for approved researchers, following the process and contractual arrangements, outlined above and described in more detail below, following the onward sharing model agreed between NHS Digital, CLS at UCL, and UKDS.

The data provided will be pseudonymised, and will be accessed only via the UKDS secure lab. Data accessed in this way cannot be downloaded. This means that researchers will have access to a screen view only and it is not possible to remove data from the environment. Once researchers and their projects are approved, they can analyse the data remotely from their organisational desktop, or by using the UKDS Safe Room. Specialised staff will apply statistical control techniques to ensure the delivery of safe statistical results.

ADDITION OF THE SUB-LICENCE:

The process of accessing the linked data via the UKDS Service Secure Lab includes the following steps:
-Registration with the UKDS Service.
-Submission of an application, including an 'Accredited Researcher application form' and 'Research proposal'.
-Screening of application by UKDS for completeness.

Once a researcher has registered and UKDS has screened / approved the application:

1) UKDS sends project proposal (researcher application forms) to CLS at UCL for Data Access Committee (DAC) approval. Applicants are required to demonstrate they have security assurance in place (System Level Security Policy/ISO Certificate/DSPT).

2) CLS sends the “CLS Licence Agreement for linked NHS Digital Data” (henceforth referred to as “UCL License Agreement”) to researchers to be completed and signed by their organisation. This agreement covers items not covered by the UKDS application, including the benefits to health and social care and evidence of organisational security assurances.

3) Researcher/their organisation representative will send the UCL License Agreement for Linked NHS Digital data completed and signed back to CLS.

4) CLS will: a) check the organisational Information Governance and security assurance evidence provided as per Section 15 Organisational Security Assurance of the UCL Licence agreement, b) send the project for CLS DAC approval.

5) Should evidence of organisational Information Governance and security assurance provided not meet the requirements (as outlined in the sub-license), CLS will request the applicant to provide further evidence, and will only submit the project to CLS DAC for approval when evidence provided is satisfactory. Approval for data access will only be granted to applicant organisations that meet the security assurance requirement.

6) CLS DAC will assess both documents (UKDS project proposal + UCL Licence agreement) and make a decision to approve it, not approve it, or require further information. Should an application be rejected, a researcher can apply again with a revised application.

7) In the UCL License agreement, CLS DAC will, among other things, assess the benefits for health and social care statement and decide whether it is satisfied with the answer.

8) Once CLS DAC approves the project:
a) CLS will inform UKDS that the project has been approved.
b) CLS representative should sign the UCL License agreement noting the DAC reference number on the License document and send it back to the organisation of the applicant (the Principal Investigator for the study requiring access will sign the agreement - they will be an authorised signatory for their organisation).
c) CLS DAC will publish the information about any data dissemination on the NHS Digital release register, including the name of the organisation to which data was provided, purpose (summary of the project) and what data was released. (NB. If CLS DAC doesn't approve the project, no data will be disseminated).

9) If CLS DAC is not satisfied with the evidence provided by the applicant about the benefits to health and social care, then CLS DAC can ask the applicant to provide additional information and the project can be re-submitted for CLS DAC approval on the next CLS DAC meeting or via Chair approval.

10) UKDS will inform the researcher that their project was approved and make the data available to them via Secure access to linked data at the Safe Centre at the UKDS (hosted at the University of Essex) or via the researcher's own institutional desktop PC, depending on the sensitivity/impact level of the data being requested.

11) UCL will inform NHS Digital as to who they have issued sub-licences to, in a format agreed with NHS Digital.
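The ordering of checks in the steps above can be summarised as a simple decision function. This Python sketch is a simplification under stated assumptions: the class, field and outcome names are hypothetical, each check is reduced to a yes/no flag, and outright rejection by the DAC is not modelled.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    RETURN_TO_APPLICANT = "application incomplete"
    REQUEST_SECURITY_EVIDENCE = "further security assurance evidence requested"
    REQUEST_BENEFITS_INFO = "further benefits information requested"
    APPROVED = "approved; UKDS informs researcher and provides access"


@dataclass
class Application:  # hypothetical fields, one flag per check
    ukds_screening_passed: bool
    licence_signed: bool
    security_assurance_ok: bool
    benefits_statement_ok: bool


def next_outcome(app: Application) -> Outcome:
    """Apply the checks in the order the steps describe them."""
    if not (app.ukds_screening_passed and app.licence_signed):
        return Outcome.RETURN_TO_APPLICANT
    # CLS checks security assurance before the project goes to the DAC.
    if not app.security_assurance_ok:
        return Outcome.REQUEST_SECURITY_EVIDENCE
    # The DAC then assesses the benefits to health and social care.
    if not app.benefits_statement_ok:
        return Outcome.REQUEST_BENEFITS_INFO
    return Outcome.APPROVED


print(next_outcome(Application(True, True, False, True)).value)
```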

Only staff who are permitted to work at the UKDS (and are substantively employed by University of Essex) will process the data for sub-licensing. Note that any data accessed through the UKDS Secure Lab can only be accessed under secure conditions and cannot be downloaded. The linked data provided to approved researchers may be subject to sub-setting of variables (and if necessary cases) to minimize disclosure risks and ensure that no individual or organisation can be identified from the results. In addition, all statistical outputs are subject to statistical disclosure control procedure. Access to the Secure Lab is only available to researchers who are based at a UK academic institution or an ESRC-funded research centre, and are an ESRC Accredited Researcher. PhD and research students can request access but must apply jointly with their supervisors from established organisations.

No further onward sharing can occur beyond the sub-licence.

UKDS SECURE DATA HANDLING PROCEDURES:

UKDS has received government technical accreditation and has been certified for its secure data handling procedures under the international standard for information security (ISO 27001). To maintain this certification, regular internal and external audits are undertaken. UKDS also hires a government-approved company to conduct internal and external penetration testing of its Secure Lab systems.

More widely, the UKDS employs an Information Security Management System (ISMS) to ensure compliance with the ISO accreditation. The Secure Lab falls into this system, and a number of documented processes are regularly maintained and reviewed to ensure these processes are robust, relevant, and fit-for-purpose. The ISMS is overseen by an Information Security Management Group (ISMG), which regularly meets and approves changes to procedures.


All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by 'Personnel' (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).


Research on Health and Ageing using English Longitudinal Study of Ageing (ELSA) — DARS-NIC-30493-Y0C0K

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant, Identifiable (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(2)(b)(ii), Health and Social Care Act 2012 - s261 - 'Other dissemination of information'

Purposes: No (Academic)

Sensitive: Non-Sensitive, and Sensitive

When:DSA runs 2019-02-21 — 2022-02-20 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: NATIONAL CENTRE FOR SOCIAL RESEARCH, UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Hospital Episode Statistics Accident and Emergency
  2. Hospital Episode Statistics Admitted Patient Care
  3. Hospital Episode Statistics Critical Care
  4. Hospital Episode Statistics Outpatients
  5. MRIS - Cause of Death Report
  6. MRIS - Cohort Event Notification Report
  7. MRIS - Flagging Current Status Report
  8. MRIS - Members and Postings Report
  9. Hospital Episode Statistics Accident and Emergency (HES A and E)
  10. Hospital Episode Statistics Admitted Patient Care (HES APC)
  11. Hospital Episode Statistics Critical Care (HES Critical Care)
  12. Hospital Episode Statistics Outpatients (HES OP)

Objectives:

The English Longitudinal Study of Ageing (ELSA) is a well-established, on-going, multi-disciplinary cohort study involving a collaboration between University College London (UCL), the Institute for Fiscal Studies (IFS), the University of Manchester (UoM), and NatCen Social Research (NatCen).

Since its inception in 2002 it has provided valuable insights into a range of social, health and economic issues. Traditionally, data have been collected biennially face-to-face via interview and clinical examination. While this approach has been very useful and will continue, linkage of study members in ELSA to routinely-collected data offers not only additional rich, complementary information about their health which cannot be gathered using these methods (e.g., valid data on diagnosis and prognosis of common chronic diseases such as cancer and depression) but, crucially, data which come at no burden to the study members. Participants are invited to re-consent every 2 years (known as Waves) with Wave 8 recently completed and Wave 9 due to start in Summer 2018.

The Department of Epidemiology and Public Health at University College London (UCL) require linked pseudonymised Hospital Episode Statistics (Admitted Patient Care, Outpatient, Critical Care and Accident & Emergency) and Cancer registration data to meet their research obligations as part of the ELSA research group. This agreement will permit NatCen (under NIC-311182-N0L1Y subject to an active DSA and supporting purpose) to share pseudonymised linked HES and Cancer data in order for UCL (under this agreement) to carry out their obligations.

The requested data will be used for a programme of research on health and ageing in England. This is a long-standing and on-going programme of work which aims to improve understanding of the ageing process, and how the use of health care affects this ageing process and the evolution of health over the lifecycle.

Linking NHS Digital data with ELSA will allow UCL to combine detailed information on health outcomes; the use of hospital services; the quality of health care and the identification of trends in health that will impact on future demands for health care with wider characteristics of the elderly population. The proposed linkage of ELSA to administrative health data will provide novel data for research on ageing in England. Existing studies on ageing, and in particular on the use of health and social care services by individuals as they age, have been restricted by extremely limited data on the use of these services. Studies on the evolution of health at older ages using administrative health records have also been limited by a lack of information on the socio-economic and wider health characteristics of individuals. Linking the data together therefore provides a rich dataset which enables research in this crucial policy area.


The work will be carried out by researchers at UCL and is funded by research grants from the National Institute on Aging (NIA) in the USA, and the Economic and Social Research Council (ESRC) in the UK. There is a team at UCL who carry out research on ELSA as well as manage the study, and this is funded by the MRC, Cancer Research UK and other funders as well as by NIA and the ESRC. This request is to allow that team to carry out this research.

These new data will be used to continue UCL’s programme of research on ageing and health in England which aims to improve understanding of the ageing process, including predictors of various age-related disease diagnoses (e.g., heart disease, stroke, specific cancers, depression, dementia), and how health and the use of health care affect that ageing process. This will have value in planning services, predicting future needs, estimating the costs of care, and understanding the impact of various ageing states on individuals and their families. All analyses will use de-identified data.

Specifically, the data will improve understanding of:
1) Variation in hospital use across individuals with different individual and family characteristics, particularly at the end of life, providing a new understanding of the extent to which spending is efficiently allocated across different types of people.

2) The relationship between social care provision and hospital use, in particular the extent to which lack of social care availability may increase use of hospital care either through more entry into hospital or delayed exit.

3) Social inequalities in health among older people in England, and the relationship between these inequalities and other social characteristics, quality of care, and disability. This work is important for future planning of health care provision.

4) New funding from the National Institute on Aging has charged UCL explicitly to model the incidence of dementia in the ELSA population, and this will be greatly facilitated by the availability of hospital care statistics. At present, UCL’s estimates of dementia incidence and prevalence are based on cognitive tests and ratings from relatives and carers. Having information about the use of hospital services by ELSA participants with dementia and cognitive impairment will strengthen the evidence base and provide a platform for more detailed analyses of the determinants and consequences of dementia. UCL will also be able to fulfil the mandate from the National Institute on Aging to provide data that can be used to compare dementia rates in the UK and USA.


UCL will use the pseudonymised linked data for research on ageing to understand what factors influence survival, and whether attrition from the repeated measures in the study has happened because participants dropped out or because they died.

Another objective of this research project is to compare the risk of mortality following the onset of different health conditions across demographic and socioeconomic groups within the older population in England, and between similar groups in England and the USA.


The cancer registration data is specifically required for projects funded by Cancer Research UK. These will address the following issues:

1) The relationship between body weight, changes in body weight, and cancer incidence among older people.

2) The impact of cancer diagnosis on health behaviours and quality of life. Up to now, UCL have based these analyses on self-reported diagnoses (Br J Cancer, 2013; Psycho-Oncology, 2016). But registry data will allow these issues to be investigated with greater precision.

3) The association between bowel cancer screening (measured in ELSA since 2012) and cancer incidence.

4) The relationship between psychosocial factors (depression, social isolation, cognitive function) and cancer incidence.


Other projects that will take place as part of this programme of work are:
(1) To examine how the pattern of hospital care use changes in the final year(s) of life, and to examine whether it is proximity to death, as opposed to age, that determines healthcare utilisation (controlling for other characteristics captured in the ELSA data).

(2) To compare the risk of survival following the onset of different health conditions across demographic and socioeconomic groups within the older population in England, and between similar groups in England. UCL will use the information on cause of death to find out who has had an onset of a condition prior to their death, so that UCL can work out the probability of survival among those who experience (e.g.) a heart attack. UCL have missing survey information on those who die before they are able to report a new onset, and the cause of death information allows UCL to fill in the gap.



The requested data would be used solely for research purposes, in accordance with the research aims stated above. UCL would model the use of NHS-funded inpatient services (provided by HES) by individuals with similar underlying health needs, but who differ in other, non-need based characteristics (provided by ELSA). UCL would then model the relationship between receipt of social care and hospital admissions to examine whether cuts in social care are likely to increase the probability of hospital admission.

Yielded Benefits:

The data were only received in Summer 2018, and as yet no work has been published. As a result, this work has not yet yielded any of the expected benefits. It is expected that publications will be produced from Autumn 2019 onwards, with benefits to follow after this.

Expected Benefits:

The twin pressures of a rapidly ageing population and a prolonged period of public spending austerity will produce unprecedented pressures on NHS services over the coming years. The English population aged 65 and over is expected to grow by more than 20% over the next decade. Meanwhile, the NHS is experiencing a period of funding austerity, with little increase over the past few years. Understanding how to meet these additional demands with fewer resources is therefore a key challenge for health policymakers and practitioners. The importance of this challenge is reflected in the recent policy and practice debate (e.g. the Better Care Fund), and the size of the challenge has been well documented by the Dilnot Commission and initiatives such as the Quality Innovation Productivity Prevention (QIPP) programme. It also highlights the importance of prevention rather than cure, and the crucial role played by lifestyle and behaviour in more effective prevention.

Existing work on ELSA has been used to inform policy makers including Monitor, NHS England, the Department of Health, the Cabinet Office and representatives from PCTs. This type of academic work helps to understand the impacts of former policy and guides improvements to the existing health and social care system. The DH has noted that they “have no doubt that the linkage of the Hospital Episode Statistic with survey data from the English Longitudinal Study of Ageing will be a valuable source of information in understanding the variation of health care use across individual with similar medical needs but different characteristics”.


The data linkage between HES and Cancer Registration data with ELSA would provide an important contribution to this debate. ELSA has high quality data on the social circumstances of a large representative sample of older people living in England, very precise economic data on wealth and income, a broad range of psychological factors, information about health and disability, lifestyle factors relevant to health including physical activity, diet and cancer screening participation, objective measures of cognitive function, physical capacity, health-related biomarkers, and genetic data. A particular strength of ELSA is that it is a longitudinal study involving data collection every two years since 2002. This allows trajectories of social, economic, psychological and biological processes to be tracked over prolonged periods.


The linkage would provide detailed information on the characteristics of individuals who use health and social care services and who experience major health problems. This will allow a detailed analysis of who uses these services, and to identify any spillovers in the use of health and social care (e.g. do cuts in social care spending have negative impacts on NHS services). In particular, the ability to follow the same individuals over an extended period of time will provide information on how needs for (and use of) health and social care have changed across cohorts. This will contribute directly to an important debate over the size of additional pressures on services as a result of an ageing population (e.g. does ‘healthy’ ageing lead to increased health spending?). UCL will also be able to strengthen the evidence base for the importance of maintenance of healthy lifestyles into older ages. Some people believe that once they have reached 60 or 65 years old, then sustained physical activity, healthy dietary choices and other lifestyle factors no longer matter because they have already done their damage. UCL research with other datasets indicates this is not the case, but ELSA can provide even more convincing evidence because of its representative sample and long follow-up, and will help to drive the prevention agenda in order to reduce health and social care costs among the elderly.

Outputs:

The number of publications using ELSA data is substantial (more than 200 at time of writing), and ELSA researchers have given numerous presentations, talks and seminars at government seminars, academic conferences, and policy workshops. Other outputs will include presentations at academic conferences and presentations to policy makers. Presentations with policymakers will focus on disseminating results, and helping to inform the government departments who are involved in planning and delivering NHS care to elderly individuals. All outputs will only report large sample aggregate statistics and regression outputs and small numbers will be suppressed in line with the HES analysis guide. No individual or episode level data will ever be published.
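The small-number rule mentioned above could be applied to an aggregate table along the following lines. This is only a minimal sketch: the function name, the example table and the threshold of 7 are assumptions made for illustration, and the applicable threshold should be taken from the HES analysis guide rather than from this example.

```python
from typing import Dict, Optional


def suppress_small_counts(
    counts: Dict[str, int], threshold: int = 7
) -> Dict[str, Optional[int]]:
    """Return a copy of `counts` with small non-zero cells suppressed.

    Cells with a value from 1 to `threshold` (inclusive) become None.
    The default threshold is a placeholder assumption, not a value
    taken from the agreement.
    """
    return {
        label: (None if 1 <= n <= threshold else n)
        for label, n in counts.items()
    }


# Hypothetical aggregate table of admissions by area.
table = {"Area A": 0, "Area B": 3, "Area C": 7, "Area D": 42}
published = suppress_small_counts(table)
```

Zero cells are left as they are in this sketch; whether zeros also need protecting depends on the guidance in force.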


Two types of written output are expected:
(i) Articles submitted to peer-reviewed journals. Previous results have been published in high impact general medical and scientific journals (Lancet, British Medical Journal, Journal of the American Medical Association, Proceedings of the National Academy of Sciences USA), and in specialist journals in epidemiology, public health, and social science.
(ii) Non-technical research summaries which will be press-released and target policy makers, such as the Department of Health and NHS England.

Processing:

As specified, ELSA consists of NatCen, IFS, UCL, and UoM working in collaboration, with NatCen as the lead organisation. NatCen are the holders of the ELSA cohort and thus only NatCen hold the identifiable data in association with this cohort. IFS and UCL only have access to a pseudonymised version of the ELSA data with only the pseudonymised study ID as a form of identifier. For the purpose of this agreement UCL will obtain pseudonymised data from NatCen directly.

The data shared by NatCen contains both ELSA data (pseudonymised) and data provided by NHS Digital under NIC-311182-N0L1Y (also pseudonymised). The NHS Digital data shared will be restricted to the fields and pseudonymisation specified in this agreement and in NatCen's agreement under NIC-311182-N0L1Y.

The data received from NHS Digital will be converted by NatCen into a pseudonymised format before onward sharing to UCL by removing identifying data. Date of Birth, Date of Death, Date of Inquest, and Date of Registration will be converted to MM/YYYY format. The cancer registration number will be truncated to the first 6 digits. Only pseudonymised data will be shared with UCL and UCL may only receive, process and retain the data with an active NHS Digital Data Sharing Agreement in place.
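The conversion described above (dates reduced to MM/YYYY, the cancer registration number cut to its first 6 digits, identifying fields removed) might look something like the sketch below. It is illustrative only and is not NatCen's actual procedure: every field name, the record layout and the helper functions are hypothetical.

```python
from datetime import date
from typing import Optional

# Hypothetical field names for the four dates named in the agreement.
DATE_FIELDS = (
    "date_of_birth",
    "date_of_death",
    "date_of_inquest",
    "date_of_registration",
)


def coarsen_date(value: Optional[date]) -> Optional[str]:
    """Reduce a full date to MM/YYYY, keeping missing values missing."""
    return None if value is None else value.strftime("%m/%Y")


def truncate_registration_number(
    value: Optional[str], keep: int = 6
) -> Optional[str]:
    """Keep only the first `keep` characters of the registration number."""
    return None if value is None else value[:keep]


def pseudonymise(record: dict) -> dict:
    """Build the shared record from an allow-list of fields.

    Anything not copied explicitly (for example a name or NHS number)
    is dropped, so identifying data never reaches the output.
    """
    shared = {"study_id": record["study_id"]}
    for field in DATE_FIELDS:
        shared[field] = coarsen_date(record.get(field))
    shared["cancer_registration_number"] = truncate_registration_number(
        record.get("cancer_registration_number")
    )
    return shared


example = pseudonymise(
    {
        "study_id": "ELSA-0001",
        "name": "not shared",
        "date_of_birth": date(1952, 3, 14),
        "cancer_registration_number": "1234567890",
    }
)
```

Building the output from an allow-list, rather than deleting known identifiers from the input, means a newly added identifying field cannot slip through unnoticed.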

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

No data will be shared outside of the ELSA research group. Each organisation requesting access to the data will be required to hold an active agreement with NHS Digital.

All persons accessing the data are direct employees of UCL and are named ELSA collaborators.

The Data will only be used for the purposes described in this agreement.

UCL do not require identifiable data nor will they attempt to re-identify this data. The data will not be linked to any other dataset.

The UCL team require the earliest data (from 1997/98 or as far back as possible) on all admissions, outpatient appointments, critical care and A&E attendances at NHS hospitals in England. The youngest ELSA cohorts were born in the 1950s; that, coupled with the information UCL have about early life experiences, allows UCL to predict both the risks and consequences of hospital usage over time and across much of the life course. The statistical power of the research will be greatly enhanced by capturing as many years of hospital usage as possible and will enable the UCL team to carry out their proposed research in a statistically robust manner.


When turning the supplied data into outputs, UCL will do the following:

(1) Produce hospital utilization measures
(2) Produce death outcome measures
(3) Produce cause of death measures
(4) Use (1), (2) and (3) to run regressions, correlations and produce descriptive statistics to further our understanding of the relationship between these and the ageing process.
(5) Create indicators of hospital admissions at the local authority level
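
Steps (1), (2), (3) and (5) above amount to deriving per-person and per-area measures from episode-level records. A minimal sketch of that derivation follows. It is not UCL's actual code: the record layout, the identifiers and the cause-of-death code are all hypothetical, and the regression step (4) is left out.

```python
from collections import Counter

# Hypothetical pseudonymised inputs: one row per hospital admission,
# plus a lookup of deaths keyed by pseudonymised study ID.
episodes = [
    {"study_id": "P1", "local_authority": "LA1"},
    {"study_id": "P1", "local_authority": "LA1"},
    {"study_id": "P2", "local_authority": "LA1"},
    {"study_id": "P3", "local_authority": "LA2"},
]
deaths = {"P2": "I21"}  # study_id -> cause-of-death code (illustrative)
cohort = ["P1", "P2", "P3", "P4"]

# (1) Hospital utilisation measure: admissions per cohort member.
admissions_per_person = Counter(e["study_id"] for e in episodes)

# (2) and (3) Death outcome and cause-of-death measures, joined to (1).
person_measures = {
    pid: {
        "admissions": admissions_per_person.get(pid, 0),
        "died": pid in deaths,
        "cause_of_death": deaths.get(pid),
    }
    for pid in cohort
}

# (5) Indicator of hospital admissions at local authority level.
admissions_per_la = Counter(e["local_authority"] for e in episodes)
```

Cohort members with no episodes are kept with a count of zero, so that non-users of hospital care remain in any later analysis.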


Dose-response relationship between alcohol and suicide — DARS-NIC-287229-D6F9F

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(2)(b)(ii), Health and Social Care Act 2012 – s261(2)(a)

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2019-08-27 — 2022-08-26 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Adult Psychiatric Morbidity Survey
  2. Adult Psychiatric Morbidity Survey (APMS)

Objectives:

University College London (UCL) requires Adult Psychiatric Morbidity Survey (APMS) data to investigate if there is a dose-response relationship between weekly alcohol consumption and suicidal/self-harm behaviours in the general population.

A link between alcohol misuse and suicidal behaviours is well-established. Alcohol use disorder (AUD) and acute use of alcohol prior to attempt (AUA) are particularly significant risk factors for suicidal behaviours. There is little research exploring the relationship between alcohol consumption and self-harm and suicidality at a population level, and previous research has shown inconsistent results.

The data is being processed in line with EU GDPR Article 6 (1) (e) “for the performance of a task carried out in the public interest” and EU GDPR Article 9 (2) (j) “processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1)”. Approximately 800,000 people die by suicide worldwide each year, with approximately 6,000 deaths by suicide in the UK. It is in the public interest to conduct research to identify trends and risk factors for suicidal and self-harming behaviours to inform policy and practice and potentially inform interventions for these at-risk groups to lessen the societal burden of suicide and self-harm. Participants have willingly participated in the Adult Psychiatric Morbidity Survey (APMS) knowing their data will be used for research purposes and consenting to this. All APMS 2014 data is pseudonymised, protecting the identities of participants.

The APMS data will serve this aim: statistical analyses (logistic regression) will be carried out to test the association between alcohol use and self-harm and suicidality. Alcohol use has been collected in APMS using the AUDIT score, including specific questions on units of alcohol consumed and drinking days. A weekly measure of alcohol consumption will be created based on the responses to these questions. Total AUDIT scores will also be used to analyse the relationship between alcohol consumption and self-harm & suicidality.
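
As an illustration of how a weekly consumption measure might be derived and checked for a dose-response pattern, the sketch below multiplies units per drinking day by drinking days per week and then computes crude odds ratios by consumption band. It is not the study's method: it swaps the logistic regression named in the agreement for a simple banded odds ratio, and the variable names, band cut-offs and respondent data are all invented for the example.

```python
from collections import defaultdict


def weekly_units(units_per_drinking_day: float, drinking_days_per_week: float) -> float:
    """Weekly consumption as units per drinking day times drinking days."""
    return units_per_drinking_day * drinking_days_per_week


def band(units: float) -> str:
    """Assign a consumption band; the cut-offs here are arbitrary."""
    if units == 0:
        return "none"
    if units <= 14:
        return "1-14"
    return "15+"


# Invented respondents: (units per drinking day, drinking days, outcome).
respondents = (
    [(0, 0, False)] * 4 + [(0, 0, True)]
    + [(2, 3, False)] * 2 + [(2, 3, True)]
    + [(4, 5, False)] * 2 + [(4, 5, True)] * 2
)

# Tally cases and non-cases of the outcome within each band.
counts = defaultdict(lambda: {"cases": 0, "noncases": 0})
for per_day, days, outcome in respondents:
    key = "cases" if outcome else "noncases"
    counts[band(weekly_units(per_day, days))][key] += 1


def odds(cell: dict) -> float:
    return cell["cases"] / cell["noncases"]


# Crude odds ratio for each band relative to the non-drinking band.
reference = odds(counts["none"])
odds_ratios = {b: odds(cell) / reference for b, cell in counts.items()}
```

Odds ratios that rise steadily across bands would be consistent with a dose-response relationship, although a regression model is still needed to adjust for confounders.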

This research will be undertaken as part of PhD research on alcohol and substance use as a risk factor for suicide and self-harm across the lifespan, funded by the NIHR School for Public Health Research. This will be the first in a series of studies unpacking the relationship between alcohol consumption and self-harm/suicidality at a population health level. Depending on the results observed in this study, it is expected that future longitudinal research will be conducted using UK GP records (THIN/CPRD).

The data subjects for this study are UK residents aged over 16 years living in private households who have completed the APMS survey. There is no control group.

The APMS data is a nationally representative sample of the UK general population. Due to the rare outcomes of self-harm/suicidality UCL need a sufficiently large sample to get meaningful results for the research question, so the full 2014 APMS sample is required. Individuals’ responses within APMS are pseudonymised. No linkages are requested at this time.

University College London (UCL) is the sole Data Controller and also processes the data.

NIHR School for Public Health Research are funding the PhD project for which this research project is being conducted.

Outputs:

The research paper will be submitted to an academic peer-reviewed journal specialising in psychiatry/public health, such as the BMJ or Journal of Affective Disorders.

The PhD thesis will include the research results. The thesis will be uploaded and freely available via http://discovery.ucl.ac.uk/ upon completion of the PhD.

The aim is for the results to be presented at an academic conference with an interest in suicide/self-harm/mental health. Results will be presented at the School for Public Health Research Annual Scientific Meeting in March 2020 and will be submitted to the ECR/MCR Suicide and Self-Harm Research Forum and the European Symposium on Suicide and Suicidal Behaviour in September 2020.

If results of interest emerge from the study, a public health resource may be created to inform clinicians and policy makers. This will include recommendations for practitioners, policy makers and the public with respect to the improved knowledge of the links between alcohol consumption and self-harm/suicidality. The public resource summarising the outputs and key findings should assist clinicians with the recognition of at-risk individuals and populations so they can be offered the necessary support and appropriate targeted interventions can be designed for these groups. The core research team work clinically in primary and secondary healthcare settings, and have direct links with the NIHR School for Public Health Research and Public Health England. The research team will work with these organisations to plan the distribution and dissemination of this work to maximise its impact, while being mindful of the sensitive nature of this subject.

Results will also be broadcast through Twitter. Relevant organisations, such as Samaritans, Addaction and Rethink Mental Illness, will be contacted both through Twitter and by email to make them aware of the results, and they will be encouraged to broadcast the results through their own channels to increase public awareness of the research findings.

The paper for the peer-reviewed journal is expected to be submitted by January 2020. The expected date for presenting at a conference is by March 2020.

APMS low numbers and suppression
In order to protect patient confidentiality in publications resulting from analysis of APMS data users must:
• guarantee that any outputs made available to anyone other than those with whom this agreement is made, will meet required standards, including the guarantee, methods and standards contained in the Code of Practice for Official Statistics (http://www.statisticsauthority.gov.uk/assessment/code-of-practice/index.html) and the ONS Statistical Disclosure Control (https://gss.civilservice.gov.uk/statistics/methodology-2/statistical-disclosure-control/) for tables produced from surveys;
• apply methods and standards specified in the Microdata Handling and Security Guide to Good Practice (http://www.data-archive.ac.uk/media/132701/UKDA171-SS-MicrodataHandling.pdf) for disclosure control for statistical outputs.

Processing:

This request is for the APMS 2014 dataset to flow out of NHS Digital. Some of this relates to health-related data given that some screening questionnaires were used in the conduct of the survey. Once this agreement is active the actual flow of data will be from the UK Data Service (UKDS), who will grant system access to pseudonymised APMS data to University College London. There are no subsequent flows of data.

Data will be stored and handled in line with UCL’s Data Protection Policy. It will be marked as highly confidential and only be accessed by the principal research team through the UCL Safe Haven source. The principal research team consists of a PhD student and senior researchers (including a professor and a senior clinical lecturer). All are substantive employees of UCL.

Logistic regression will be carried out on the data for the main statistical analyses. All statistical testing will be conducted using Stata version 16. All files, including coding instructions, will be deleted upon conclusion of this agreement.

All APMS participants' data will be included in the research. No linkage to other sources will be conducted. APMS data are pseudonymised.

There will be no requirement or attempt to re-identify individuals.

The 2014 APMS dataset (English adult population aged 16 and over) is held on behalf of NHS Digital by the UK Data Service (UKDS) (www.ukdataservice.ac.uk) and UKDS are responsible for dissemination under direction by NHS Digital. UCL will get the whole dataset; there is no facility to select individual variables.

UCL will be able to download the dataset from UKDS for the period specified within the data sharing agreement (DSA) and they must securely destroy all local copies of the dataset when the DSA expires and notify the Data Access Request Service (DARS) in line with standard procedures. This 2014 version of the dataset available via DARS has been redacted on Disclosure Control Procedure advice to minimise the likelihood of individuals being able to identify anyone taking part in the survey.

NHS Digital reminds all organisations party to this agreement of the need to comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract i.e.: employees, agents and contractors of the Data Recipient who may have access to that data).


Examining loneliness in people with borderline intellectual functioning compared to the general population and its relationship to mental and physical health outcomes — DARS-NIC-177523-N8J2S

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 - s261 - 'Other dissemination of information', Health and Social Care Act 2012 – s261(2)(b)(ii)

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2018-06-26 — 2021-05-25 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Adult Psychiatric Morbidity Survey
  2. Adult Psychiatric Morbidity Survey (APMS)

Objectives:

The objective is to use the APMS 2014 dataset for the purposes of research (MSc research project). A secondary analysis of the data will be carried out by researchers at UCL in order to investigate how loneliness may affect people with borderline intellectual functioning.

Background
A prevalence of loneliness of 10.5% has been reported in the general population.
Loneliness has been associated with being female but there appears to be a complex relationship with age, with some studies reporting a U-shaped relationship with higher levels of loneliness in younger and older people, or higher levels in older age. Other socio-demographic factors associated with loneliness include being single, living alone, low education and income, immigration status and low social support. Loneliness has been associated with lifestyle factors such as smoking, being less physically active and lower consumption of fruit and vegetables. Loneliness is associated with increased mortality and higher rates of chronic diseases such as raised blood pressure and cholesterol and chronic heart disease. Loneliness has also been linked to depression and higher levels of psychological distress, suicide and psychosis.

The prevalence of loneliness in people with intellectual disability (ID) has been reported to be 44.7%, which is thought to be higher than in the general population. Loneliness in people with ID has been associated with increasing age and living in a large residential setting, with lower levels of loneliness being associated with having choice of living companions or living with family. Loneliness was also associated with being afraid at home and in the neighbourhood, but liking where you live was associated with less loneliness. In addition, social contact with friends and family was associated with less loneliness. Studies on the association between loneliness and mental health problems are limited. However, one study did find an association with depression. Not feeling lonely has been associated with better physical health.

However, little is known about the prevalence, risk factors and outcomes associated with loneliness in people with borderline intellectual functioning. Borderline intellectual functioning is generally defined as having an IQ score between 70 and 85. This group has increased vulnerability to social disadvantage and mental health problems.

The APMS dataset has not previously been used to explore loneliness in this group.

The aim of the study is to examine loneliness in people with borderline intellectual functioning and compare their physical and mental outcomes to the general population. The specific objectives are to:
1. Compare the prevalence of loneliness/social support in people with borderline intellectual functioning and the general population
2. Explore the association between loneliness/social support and socio-demographic variables (age, sex, ethnicity, qualifications, income, employment, accommodation and neighbourhood characteristics) separately in people with borderline intellectual functioning and the general population to explore similarities and differences in the associations.
3. Explore the relationship between loneliness/social support and wellbeing, common mental disorders (depression and anxiety disorders) and chronic physical health conditions separately in people with borderline intellectual functioning and the general population in order to identify similarities and differences in the associations
4. Examine whether loneliness/social support moderates the relationship between intellectual functioning and mental disorders (anxiety, depression etc.), chronic physical disorders and wellbeing.

The data will be analysed only within University College London. It will not be used to support a larger programme of work.

Expected Benefits:

The research study will further the understanding of the mental and physical health impacts of loneliness in people with borderline intellectual functioning.

The results will be disseminated to commissioners and mental health charities (e.g. MIND), including befriending organisations that support people who are lonely or those who have limited or no social support (target date 01/06/2019). Tackling and reducing loneliness may lead to improvements in physical and mental health outcomes and therefore the research findings could help to promote the role of befriending and volunteering organisations and provide evidence for the need to develop interventions to reduce loneliness in this group (and other disadvantaged groups).

Outputs:

The results of the study will be of interest to mental health practitioners and mental health services that encounter people with borderline intellectual impairment. The study will raise awareness of the issues experienced by people with borderline intellectual functioning and how their needs should be better met. The findings of the study will be published in a peer reviewed scientific journal such as the Journal of Intellectual Disability Research and presented at conferences (e.g. the International Association for the Scientific Study of Intellectual and Developmental Disabilities). Personal identifiable data will not be published. Outputs will only contain aggregate level data.

The target date for publication within a scientific journal is 02/01/2019.

In order to protect patient confidentiality in publications resulting from analysis of APMS data users must:
· guarantee that any outputs made available to anyone other than those with whom this agreement is made, will meet required standards, including the guarantee, methods and standards contained in the Code of Practice for Official Statistics and the ONS Statistical Disclosure Control for tables produced from surveys;
· apply methods and standards specified in the Microdata Handling and Security Guide to Good Practice for disclosure control for statistical outputs.

Processing:

NHS Digital reminds all organisations party to this agreement of the need to comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

The 2014 APMS dataset is held on behalf of NHS Digital by the UK Data Service (UKDS) (www.ukdataservice.ac.uk) and UKDS are responsible for dissemination under direction by NHS Digital. UCL will get the whole dataset; there is no facility to select individual variables. They will be able to download the dataset from UKDS for the period specified within the DSA and they must securely destroy all local copies of the dataset when the DSA expires and notify DARS in line with standard procedures. This 2014 version of the dataset available via DARS has been redacted on Disclosure Control Procedure advice to minimise the likelihood of individuals being able to identify anyone taking part in the survey.

Once an active data sharing agreement is in place, UKDS will transfer the pseudonymised APMS data to UCL. It will be transferred and accessed within the Data Safe Haven. This is UCL's data service for storing, handling and analysing identifiable data. It has been certified to the ISO27001 information security standard and conforms to NHS Digital's Information Governance Toolkit.

The data obtained will be fully anonymous. It will be stored directly and processed only using UCL Data Safe Haven, which uses encryption and is therefore very secure. The data will only be accessed by substantive employees of UCL. Registered UCL MSc students will only have access to aggregated data with small numbers suppressed. No data will be linked to record level patient data.

Justification for processing the data:
The data will be processed according to article 6(1) e – legitimate interest under “public task”.
UCL is a public authority and therefore the legitimate interest for processing data is under “public task”. Processing data for the purposes of research is considered to be one of UCL’s public tasks. The processing of the APMS dataset is considered necessary as there are no other means of examining the objectives in a less restrictive way. Individuals will not be harmed through the processing of the data.
In addition, data will be processed according to article 9(2) j – processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes.
UCL ensures that the processing meets the public interest test and appropriate safeguards are in place such as using “technical and organizational measures” to ensure minimisation e.g. pseudonymisation and not processing in a way that will cause damage or distress to individuals.

Method
A secondary analysis of data will be conducted using The Adult Psychiatric Morbidity Survey, 2014. This is the fourth and most recent survey of adult mental health in the general population. It comprised two phases, an initial interview with the whole sample and a second phase interview that was conducted with a sub-sample of phase one participants by clinically trained interviewers coordinated by the University of Leicester.
The survey employed a multi-stage stratified probability sampling design. The sampling frame was based on the small user Postcode Address File (PAF), which permitted private households to be identified. The primary sampling units (PSU) were individual or groups of postcode sectors. The PSUs were stratified by a number of different strata and a random sample was obtained from this list. Addresses that did not contain private households were excluded. One person over the age of 16 was randomly selected to take part in the survey per household.
13313 individuals were contacted and 7546 participants completed the survey (57% response rate).

Measures
1. Measuring intellectual functioning
Intellectual functioning will be assessed using the National Adult Reading Test (NART), which is a standardized test designed to estimate the premorbid intelligence level of adults. The NART consists of 50 English words presented in ascending order of difficulty (Nelson & Willison, 1991). The NART error score is calculated from the total number of reading errors made by the candidate and this is used to estimate a verbal IQ score. An IQ between 70-85 will be used to identify the sample of participants with BIF and those with an IQ greater than 86 will be identified as being in the general population (variable name: iqbest2g). The sample will be further defined by excluding participants who have educational qualifications of A-levels or higher.
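
As a rough illustration of the grouping rule described above, the following Python sketch assigns one participant to a study group from an estimated verbal IQ. The function name, the has_a_level_or_higher flag and the group labels are hypothetical and do not come from the APMS dataset. The sketch assumes that the qualification exclusion applies to the BIF group only, and it follows the text literally in leaving an IQ of exactly 86 (and any IQ below 70) unassigned.

```python
from typing import Optional


def assign_group(iq: float, has_a_level_or_higher: bool) -> Optional[str]:
    """Return "BIF", "general", or None when the participant is not assigned.

    Thresholds follow the text: 70-85 inclusive for BIF, greater than 86
    for the general population.
    """
    if 70 <= iq <= 85:
        # Assumption: the A-level exclusion applies to the BIF group only.
        return None if has_a_level_or_higher else "BIF"
    if iq > 86:
        return "general"
    # IQ below 70, or exactly 86, is not covered by either stated range.
    return None
```

If the exclusion is intended to apply to the whole sample, the qualification check would move ahead of both IQ tests.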

2. Measuring loneliness and social support
Loneliness will be measured using one item: “I feel lonely and isolated from other people”. A four-point Likert scale was used (very much, sometimes, not often and not at all).
Social support will be measured using the total score from 7 items measuring social support, which includes items such as “there are people I know amongst my family and friends who make me happy”, “there are people I know amongst my family and friends who can be relied on no matter what” and “there are people amongst my family and friends who give me support and encouragement”. These questions were rated on a three point scale (not true, partly true, certainly true).
In addition, we will use one variable derived from the social support scale that examines the number of family and friends the respondent feels close to (variable name: Primgrp).

3. Wellbeing
Wellbeing will be measured using the total score on the Warwick-Edinburgh Mental Wellbeing Scale (WEMWBS). This is a 14-item scale with five response categories, providing a total score ranging from 14–70. The items cover both feeling and functioning aspects of mental wellbeing. A higher score indicates a higher level of mental wellbeing.
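
The total score described above can be sketched in Python as a simple sum. The function name is hypothetical, and the sketch assumes each of the five response categories is coded 1 to 5, which is consistent with the stated 14–70 range for 14 items.

```python
from typing import Sequence


def wemwbs_total(responses: Sequence[int]) -> int:
    """Sum 14 item responses (each assumed to be coded 1-5) into a total score."""
    if len(responses) != 14:
        raise ValueError("WEMWBS total requires exactly 14 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    # With 14 items coded 1-5 the total necessarily lies between 14 and 70.
    return sum(responses)
```

The sketch rejects incomplete response sets rather than imputing missing items; how missing items are handled in the actual analysis is not stated in the text.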

4. Common mental health disorders
The Clinical Interview Schedule Revised (CIS-R) was used to identify the presence of common mental disorders.
The following will be examined: participants who were diagnosed with depression in the past 12 months, participants who were diagnosed with phobia in the past 12 months, participants who were diagnosed with panic attacks in the last 12 months and participants who were diagnosed with Post Traumatic Stress Disorder in the past 12 months. The overall CISR score (variable name CISR Two) and participants who were diagnosed and treated with any common mental health problem in the last 12 months will also be examined.
Suicidal thoughts in the last 12 months will be analysed.

5. Physical health disorders
One item about general health will be used: “How is your health in general?”. This item is rated on a 5-point scale (excellent, very good, good, fair or poor).
Participants were presented with a list of 22 physical conditions and were asked whether they had ever had any of these conditions; whether they had the condition in the past year; whether the condition had been diagnosed by a health professional and if they received any medication or other treatment for it. The presence of any chronic disease (e.g. asthma, diabetes, epilepsy, high blood pressure, cancer) in the last 12 months will be examined as well as individual disorders.

6. Socio-demographic variables
The following socio-demographic variables will be analysed: age, sex, marital status, ethnicity, income, any educational qualifications, employment (paid work in the last 7 days (wrking), ever had a job) and accommodation.
Whether people feel safe in their neighbourhood will also be examined using 1 item: “I feel safe around here in the day time”. This item was measured on a five-point scale (strongly agree to strongly disagree).

Analysis
Stata will be used to analyse the data and sampling weights will be applied to all the analyses. Descriptive statistics will be used to describe the sample (proportion of people with borderline intellectual functioning, the number of males and females, mean age and ethnicity) in both groups (general population and borderline intellectual functioning). The proportion of people reporting loneliness will be compared in people with borderline intellectual functioning and the general population. Data will be presented as weighted percentages and chi-square tests/t-tests will be reported, where appropriate.
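
The analysis itself is to be run in Stata; the Python sketch below only illustrates the arithmetic behind a weighted percentage, in which each respondent counts in proportion to their sampling weight. The function and argument names are hypothetical. Standard errors and tests for a stratified, clustered design need survey-aware methods that this sketch does not attempt.

```python
from typing import Sequence


def weighted_percentage(flags: Sequence[bool], weights: Sequence[float]) -> float:
    """Percentage of the weighted sample for which the flag is true."""
    if len(flags) != len(weights):
        raise ValueError("flags and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("sum of weights must be positive")
    # Each respondent contributes their sampling weight instead of a count of 1.
    flagged = sum(w for f, w in zip(flags, weights) if f)
    return 100.0 * flagged / total
```

For example, with weights 1.0, 2.0 and 1.0 and only the first and third respondents reporting loneliness, the weighted percentage is 50%, whereas the unweighted figure would be about 67%.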

Subgroup analysis will be carried out with both groups to identify the relationship between loneliness (dependent variable) and socio-demographic variables, mental health and chronic physical disorders.
The moderating effects of loneliness on the relationship between intellectual functioning and chronic mental health and physical disorders will be analysed.
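
The text does not specify a model for the moderation analysis. One common approach, given here only as an assumed form, is a regression with an interaction term. For a binary outcome $Y_i$ (for example, depression in the past 12 months), a group indicator $G_i$ (1 for borderline intellectual functioning, 0 for the general population) and a loneliness score $L_i$, all of which are illustrative symbols:

```latex
\[
\operatorname{logit} \Pr(Y_i = 1)
  = \beta_0 + \beta_1 G_i + \beta_2 L_i + \beta_3 \,(G_i \times L_i)
\]
```

Under this assumed form, a coefficient $\beta_3$ that differs from zero would indicate that the association between intellectual functioning and the outcome varies with loneliness.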


MR104C - British Women's Heart and Health Study (s251 cohort) — DARS-NIC-174486-Q8J1B

Type of data: information not disclosed for TRE projects

Opt outs honoured: Identifiable (Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 – s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2018-06-22 — 2020-03-31 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. MRIS - Cause of Death Report
  2. MRIS - Cohort Event Notification Report
  3. MRIS - Flagging Current Status Report

Objectives:

University College London (UCL) currently holds mortality data, cancer registration data and demographic data for use in the British Women’s Heart & Health Study (BWHHS).

The BWHHS has previously been managed by both the University of Bristol and the London School of Hygiene and Tropical Medicine but since 2015 has been managed by UCL at The Institute of Health Informatics. UCL is now the sole Data Controller and, for the primary purpose of the BWHHS, UCL is the sole processor of the data.

The University of Bristol and the London School of Hygiene and Tropical Medicine have no ongoing role in the study other than individuals from the respective organisations working as study collaborators with UCL in accordance with the arrangements and processes described below.

The BWHHS aims to determine the contribution of both established and new risk factors to the considerable variation in ischaemic heart disease and stroke in Great Britain. It is also concerned with the effects of risk factor changes and their impact on Cardiovascular Disease (CVD) events and on other common causes of morbidity and mortality in British women. The present aim is to continue to collect CVD and other common diseases of incident morbidity and mortality to provide information for the prevention and promotion of a disability-free life in older women.

The BWHHS is a prospective cohort study of cardiovascular disease in women aged over 60 years, in England, Scotland and Wales.

The study was set up in 1999 to complement the British Regional Heart Study (BRHS), to describe and establish risk factors and the differences in their impact in women compared to the men followed up by the BRHS.

The study selected women at random from 24 GP practices, in 23 towns from 1999 to 2000. Of the 7,296 invited, 4,286 (60%) were recruited and attended the baseline examinations and completed questionnaires. Follow up consisted of postal questionnaires and regular reviews of GP medical records. This agreement covers the release of data in respect of 1049 participants of the study where the legal basis to address the Common Law Duty of Confidentiality (CLDC) is under Section 251.

The BWHHS has used the patient tracking service provided by NHS Digital and predecessor organisations to receive notifications of its cohort members’ deaths (date and cause), cancer registrations, exits from the NHS and changes in recorded demographics (such as name, NHS number, etc.). The latest demographics data has been used for administrative purposes to support ongoing contact with participants (e.g. for sending questionnaires) and for requesting primary care data from GPs.

The BWHHS was set up to explore the current patterns of CVD & Coronary Heart Disease (CHD) risk factors (and recent changes in this pattern), and prevention and treatment for CHD in older British women.

The BWHHS involves multiple ongoing analyses and investigations testing different hypotheses within this scope. The research activities are determined by the BWHHS study director on a monthly basis. The BWHHS study director is responsible for determining what analyses will be undertaken and what data will be used for each analysis in support of the objectives. All analyses are undertaken by members of the BWHHS team, all of whom are UCL employees working under supervision of the study director.

The clinical outcomes that are the current focuses of the BWHHS are: myocardial infarction, stroke, angina, heart-failure, diabetes, deep vein thrombosis and pulmonary embolism (DVT/PE), dementia, atrial fibrillation and cancers.

The main focus is the measurement of biological variables (biomarkers) using the biological specimens collected at baseline (in the form of DNA, plasma, and serum samples) and their effect on clinical diseases of relevance to post-menopausal women such as cardio-metabolic disease, dementia and common cancers. The continuing provision of mortality and cancer registration data increases the number of events accrued over time, which will increase the statistical power of BWHHS’s analyses.

Under this Data Sharing Agreement, the BWHHS team is not permitted to use the data to expand the focus of the BWHHS to non-cardiovascular conditions of relevance to post-menopausal women that the BWHHS currently does not collect.

Additionally UCL has made data from the BWHHS available to third parties subject to an approval process outlined below. NHS Digital has assessed what data is shared and determined that it is not sufficiently derived and therefore may not be onwardly shared without NHS Digital’s express permission.

Under this Data Sharing Agreement, UCL is not permitted to share any data supplied by NHS Digital or data derived from that data with any third parties or with UCL employees for any purposes other than the primary objectives of the BWHHS.

UCL has informed NHS Digital that it has previously shared pseudonymised data with the following organisations:
1. University College London – for the purpose of a UCL-LSHTM-Edinburgh-Bristol (UCLEB) Consortium
2. UCL investigators at the Department of Medicine
3. University of Bristol
4. University of Cambridge
5. London School of Hygiene & Tropical Medicine
6. Birkbeck, University of London
7. Universidad de Salamanca
8. University of Alcala
9. Imperial College London

No new data may be shared with these third parties. Under this Data Sharing Agreement, the above organisations are permitted to retain the data while UCL prepares and submits an application to NHS Digital for approval to onwardly share data under appropriate controls. The third parties are not permitted to onwardly share the data and may only use it for the purposes of the projects that were previously authorised by the BWHHS Study Director following the approvals process outlined in the ‘Processing activities’ section below. No changes to the existing purposes may be approved.

Data will only be used for academic research purposes and not commercially.

Findings from data have been used to inform policy and clinical guidelines and have also led to the development of prediction models available as web tools. Work conducted operates in the pre-competitive arena so does not contain patentable or commercially exploitable results.

UCL have identified the appropriate legal basis for processing under General Data Protection Regulation (GDPR). Based on the purpose for processing, the legal basis is 'processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller.' Article 6(1)(e). The research involves health data, which is included in the definition of special categories of personal data and requires an additional condition for processing. For health research this will be Article 9(2)(j) 'processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law...'.

Yielded Benefits:

The BWHHS has 218 publications within peer reviewed journals to date addressing the aims outlined previously in this application. Many of the journals for BWHHS manuscripts are considered high ranking journals. This “ranking” of journals is measured by the ‘impact factor’, which reflects the frequency with which the average article in a journal has been cited within the year. 31% of the journals publishing BWHHS manuscripts have a high impact factor (>8). This means that the output is published in journals that are widely read by doctors, scientists and public health practitioners, ensuring a greater impact of the work. The BWHHS has also had measurable benefit directly, as fourteen of the 218 publications have contributed to the following fifteen clinical care and public health guidelines:
1. National Clinical Guideline Centre NICE clinical guideline CG181 Lipid modification (2014)
2. Diabetes, Pre-Diabetes and Cardiovascular Diseases developed with the EASD ESC Clinical Practice Guidelines (2013)
3. Dyslipidaemias 2016 (Management of) ESC Clinical Practice Guidelines (2011)
4. Dyslipidaemias 2016 (Management of) ESC Clinical Practice Guidelines (2016)
5. Arterial Hypertension (Management of) ESC Clinical Practice Guidelines (2013)
6. CVD Prevention in Clinical Practice (European Guidelines on) (2016)
7. Factors Influencing the Decline in Stroke Mortality: A Statement from the American Heart Association/American Stroke Association (2013)
8. Genetics and Genomics for the Prevention and Treatment of Cardiovascular Disease: Update A Scientific Statement From the American Heart Association (2013)
9. Guidelines for the Prevention of Stroke in Patients With Stroke and Transient Ischemic Attack: A Guideline for Healthcare Professionals From the American Heart Association/American Stroke Association (2014)
10. Update on Prevention of Cardiovascular Disease in Adults With Type 2 Diabetes Mellitus in Light of Recent Evidence: A Scientific Statement From the American Heart Association and the American Diabetes Association
11. Social Determinants of Risk and Outcomes for Cardiovascular Disease: A Scientific Statement From the American Heart Association
12. Future Translational Applications From the Contemporary Genomics Era: A Scientific Statement From the American Heart Association
13. Basic Concepts and Potential Applications of Genetics and Genomics for Cardiovascular and Stroke Clinicians: A Scientific Statement From the American Heart Association
14. Salt Sensitivity of Blood Pressure: A Scientific Statement From the American Heart Association
15. Preventing and Experiencing Ischemic Heart Disease as a Woman: State of the Science. A Scientific Statement from the American Heart Association

Expected Benefits:

The benefits are labelled in a way that matches specific outputs to be produced (section 5c, above).

1) The BWHHS has on-going work on neighbourhood deprivation focusing directly on government policy to narrow the gap between the most deprived areas and the rest of the country, with the aim to provide a tool to explore the specific attributes of the built environment that affect health in elderly people (work to be completed by end of 2018).

2) BWHHS’s work on metabolomics and proteomics will help to discover which substances in the blood should be measured (called biomarker of therapeutic efficacy) to better inform the development of new medications. The expectation is that this work will help to discover such blood markers that will help to produce new medications to raise the levels of the good-cholesterol (HDL-cholesterol) and hence to improve cardiovascular health (work to be completed by end of 2018).

3) As new cardiovascular medications that are still under patent (e.g. PCSK9 inhibitors) are being adopted by healthcare systems worldwide, there will be a resurgence of interest in risk-prediction models. BWHHS’s work on metabolomics, proteomics and cardiovascular disease is expected to identify new ways to know who will suffer from a cardiovascular disease in the future; this is called risk prediction (work to be completed by end of 2020).

4) BWHHS’s work on Mendelian randomization confirms the potential of a genomics-led strategy not only to identify and validate new drug targets, but also to discover more adequate biomarkers of therapeutic efficacy and, overall, to help optimise the drug discovery process (work to be completed by end of 2020).

Outputs:

The BWHHS has led to 218 publications in peer reviewed journals to date addressing the aims outlined above.

The BWHHS typically produces manuscripts. Those working on the study do not typically engage with the end users of the study’s findings. UCL has an infrastructure for communications and determines the dissemination strategy for outputs on a case by case basis depending on factors such as the quality and impact of the manuscript. UCL’s press office may engage with the media and the public. There are examples on the webpage.

The following manuscripts were delivered within the last year:

• “Causal Associations of Adiposity and Body Fat Distribution With Coronary Heart Disease, Stroke Subtypes, and Type 2 Diabetes Mellitus: A Mendelian Randomization Analysis” was published in Circulation in 2017.
• “Investigating the importance of the local food environment for fruit and vegetable intake in older men and women in 20 UK towns: a cross-sectional analysis of two national cohorts using novel methods” was published in Int J Behav Nutr Phys Act in 2017.
• “Functional Analysis of the Coronary Heart Disease Risk Locus on Chromosome 21q22.” was published in Disease Markers in 2017.
• “Identifying low density lipoprotein cholesterol associated variants in the Annexin A2 (ANXA2) gene.” was published in Atherosclerosis in 2017.
• “Optimising measurement of health-related characteristics of the built environment: Comparing data collected by foot-based street audits, virtual street audits and routine secondary data sources.” was published in Health Place in 2017.
• A manuscript titled “PCSK9 genetic variants and risk of type 2 diabetes: a mendelian randomisation study.” was published in Lancet Diabetes & Endocrinology in 2017.
• A manuscript titled “HAPRAP: a haplotype-based iterative method for statistical fine mapping using GWAS summary statistics.” was published in Bioinformatics in 2017.
• A manuscript titled “Worldwide trends in blood pressure from 1975 to 2015: a pooled analysis of 1479 population-based measurement studies with 19.1 million participants” was published in Lancet in 2017.
• A manuscript titled “Challenges of monitoring global diabetes prevalence.” was published in Lancet Diabetes & Endocrinology in 2017.

Data from the BWHHS was used for PhD training. In the last year, two PhD theses that used BWHHS data were awarded:

“‘Barcoding’ Cardiovascular risk: Predicting cardiovascular disease in patients with systemic lupus erythematosus (SLE)” was completed in 2017.
“Mitochondrial DNA copy number as a phenotypic trait for human diseases in genetic epidemiological studies” was completed in 2017.

The following outputs will be produced:

Scientific papers are being written to investigate:

1) What components of the physical environment (e.g. access to green space) where participants of the BWHHS live affect lifestyle behaviours (e.g. physical activity) known to affect the risk of cardiovascular disorders. This will improve understanding of how the modification of the physical environment could positively impact the cardiovascular health of older women in the UK. The analysis has been completed and the manuscript is currently being drafted. Once completed, it will be submitted in the coming year to peer-reviewed journals such as International Journal of Epidemiology.

2) What substances in the blood (called metabolites) are responsible for the correlation of high levels of good-cholesterol (called HDL-cholesterol) with cardiovascular health. This work will help to inform development of new medications that, by raising the levels of the good-cholesterol, may lead to improved cardiovascular health. The analysis has been completed and the manuscript is currently being drafted. Once completed, it will be submitted in the coming year to peer-reviewed journals such as Circulation.

3) The best ways to identify whether someone will develop a heart attack or a stroke in the future. This is through the analysis of thousands of substances in the blood (what scientists call proteomics) and is intended to be more accurate than current methods used in clinical practice by GPs and specialists. The analysis will commence in November (once new data is received) and it is estimated that the analysis will be complete and the manuscript ready for publication by mid-2018. The manuscript will be submitted to peer-reviewed journals such as Circulation.

4) New drug targets for prevention of cardiovascular disorders. By combining the genetic information of BWHHS participants with the levels of substances in the blood (called proteomics and metabolomics), it is expected that targets for new medications aimed at preventing the occurrence of heart attacks or stroke will be identified. This strategy is called Mendelian randomization. Future work will use this strategy extensively.

All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.

Processing:

This agreement permits continued retention of the data only. The agreement does not permit any other processing of the data.

Identifying details for study participants were supplied to NHS Digital’s predecessor organisation. These were matched to the patient entries on the NHS Central Register (since replaced by the Personal Demographics Service) and a ‘flag’ was added to each matched entry to facilitate ongoing event reporting.

NHS Digital (and predecessors) provided routine notifications of deaths, cancer registrations and exits from or re-entries to registration with the NHS along with the latest recorded demographic data for members of the cohort.

To ensure that type 2 patient objections can be appropriately upheld, NHS Digital purged the list of participants it currently holds and the cohort will be re-flagged in two subgroups.

UCL will supply to NHS Digital two separate lists of study participants: one listing the individuals who have given informed consent and one listing the individuals for whom section 251 support permits the processing of their personal data without informed consent. The lists will contain the following identifiers: NHS Number, Date of Birth, Surname, Forename, Postcode, Gender and unique study ID.

NHS Digital will trace the relevant patient entries for each participant and flag them as being part of the relevant subgroup. NHS Digital will return separate reports for each subgroup to UCL. These will list all individuals who were successfully traced and, for those covered by section 251 support, who had not registered a type 2 patient objection.

UCL will review the data it has historically received from NHS Digital (and predecessors) and if any data was provided for individuals who were not included in either of the lists UCL will send to NHS Digital (described above), UCL must securely destroy the data of those individuals.

NHS Digital will then provide routine notifications of cancer registrations, deaths and changes to status of registration with the NHS. These will include personal demographics data.

UCL stores the data on a server in its secure safe haven facility which can be remotely accessed at the UCL Institute of Health Informatics.

Data will only be accessed by individuals within the BWHHS team at the UCL Institute of Health Informatics who have authorisation from the study director to access the data.

Participant identifiers are stored in a table that is kept separate to the research dataset which contains all other collated study data. Access to the identifiers is restricted to a small number of authorised individuals – all of whom are substantive employees of UCL – and are used for administrative purposes only such as contact with participants and requesting primary care data from participants’ GPs. The information has only been managed (received, entered and stored) by four people during the 18 years of the study.

From the data provided by NHS Digital, the UCL staff (described above) produce derivations that are added to the separate research dataset which contains only pseudonymised data from multiple sources including data from the baseline interview, questionnaires, GP record reviews and biological variables measured from biological specimens. This dataset does not contain participants’ names, NHS numbers, Dates of Birth or Dates of Death. Instead, it contains variables such as participants’ age rather than Date of Birth and, rather than Date of Death or Date of Cancer Registration, it contains calculated time to event from the date of recruitment (baseline). The baseline date is not included. Full details of how data provided by NHS Digital is converted into data added to the research dataset are given below. All subsequent analyses use only the data in the research dataset.
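
The derivation described above, replacing dates with an age and a time to event, might look like the following Python sketch. The function and field names are hypothetical. The sketch assumes that age is taken at baseline and that time to event is measured in days; the text specifies neither.

```python
from datetime import date
from typing import Dict


def derive_research_fields(
    date_of_birth: date, baseline_date: date, event_date: date
) -> Dict[str, int]:
    """Return derived, non-date variables for the research dataset."""
    # Completed years at baseline: subtract one if the birthday has not yet occurred.
    had_birthday = (baseline_date.month, baseline_date.day) >= (
        date_of_birth.month,
        date_of_birth.day,
    )
    age_at_baseline = baseline_date.year - date_of_birth.year - (0 if had_birthday else 1)

    days_to_event = (event_date - baseline_date).days
    if days_to_event < 0:
        raise ValueError("event date precedes baseline date")

    # None of the three input dates is returned, mirroring the text's statement
    # that the research dataset excludes dates of birth, death and baseline.
    return {"age_at_baseline": age_at_baseline, "days_to_event": days_to_event}
```

Returning only derived integers keeps the exact dates in the separately held identifier table.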

Extracts from the research dataset are shared with UCL staff and external collaborators for use in analyses in support of the BWHHS objectives subject to a formal approval process (outlined below). Following an audit by NHS Digital in February 2017 the BWHHS team has developed a standard operating procedure for sharing of data. External researchers wishing to collaborate with the BWHHS team will have access to the standard operating procedure document which will be available on the newly updated website at the Institute of Health Informatics, UCL.

Prospective collaborators contact the BWHHS study coordinator to informally discuss the feasibility of the proposed project, to ensure minimum criteria are met and to discuss any questions regarding the application before passing onto the BWHHS Study Director.

The application requires the submission of the BWHHS Collaborator’s Request Form, completed and signed to confirm that prospective researchers have read the terms and conditions before initial approval by the BWHHS research team and the Study Director. The application requires the following information:
• Name, affiliation and contact details of the principal investigator, who should be the main applicant.
• Project Title.
• Aims – should be focused and as specific as possible.
• Higher degree – For projects that do not involve a BWHHS researcher (i.e. a UCL employee working with BWHHS who has approval to access BWHHS data, is an ONS approved researcher and has attended information governance and data safe haven training), the main applicant should be the Principal Investigator or PhD supervisor. These applications will be considered as regular data sharing projects; no additional supervision should be expected from any BWHHS researcher.
• Details of funding – The applicant should specify whether they are seeking funding to carry out the project and whether the grant application is partially or totally based on the use of BWHHS data.
• Variables – A list of exact variable names, stating the data collection each comes from. Variables requested must be consistent with the project proposal.
• Biological samples required.
• Ethical issues.
• Timing of study.
• Signature and date to confirm that the applicant has read and will comply with the terms and conditions.

All requests are reviewed by the BWHHS study coordinator followed by a review with the BWHHS Director. The Director may deem it necessary to hold discussions with other members of the BWHHS team, before deciding whether to approve the application. Where necessary external peer review will take place.

BWHHS’s internal review process will consider:
• That there is no overlap with internal projects
• Researchers have the necessary skills
• Bona Fide Researchers – BWHHS will ensure that collaborators have conducted high quality, ethical projects for research purposes using rigorous scientific methods. They must also have a formal relationship with a bona fide research organisation, which is an established academic institution, research body or organisation with the capability to lead or participate in high quality, ethical research.
• Investigators are based and data is stored in the European Economic Area
• That the request is adequate – data requested must consist of the variables necessary to answer the research question correctly and to a high quality.
• That requested data is relevant to the proposal
• It is not an excessive request – only data necessary to answer the specific research question for which the request was made will be provided.
• Data is only intended to be used for the aims highlighted in the request form for a specific research question. If future research questions arise, the investigators will complete a new request for collaboration.
• No intention to pass on the data to a third party not named in the request
• That the request follows BWHHS’s rules of data pseudonymisation and data release as detailed below.

Once the completed and signed proposal has been reviewed and authorised by the PI, the study co-ordinator for BWHHS will be responsible for compiling the pseudonymised dataset.

Any project involving external collaborators is recommended to actively involve at least one member of the BWHHS team at UCL. Further use of data outside those specifically stated in the application form will require a new application being submitted and approval from the BWHHS team.

Sharing the data with collaborators allows wider research, maximising its value. Data shared would include only necessary variables for the research question from any of the sources including, event data from GP records, participant self-reported data, data from biological samples, data from the baseline medical examination and the derived variables from NHS Digital data. No NHS Digital data is shared with collaborators.

The data shared with collaborators complies with the following standards:

The BWHHS does not share the following variables: individual identifiers, names, addresses, postcode, telephone numbers, NHS numbers, GP details, place or date of birth. BWHHS will also not share data on low frequency events (n<30), from which an individual may be re-identified.
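The low-frequency rule above can be illustrated with a minimal sketch. It assumes categorical event labels held in a plain list and blanks out any label seen fewer than 30 times. The function names and the choice to blank values (rather than drop the variable or merge categories) are hypothetical, since the agreement does not say how BWHHS applies the rule in practice.

```python
from collections import Counter

# Threshold taken from the text (n<30); everything else here is illustrative.
MIN_COUNT = 30


def low_frequency_values(values, min_count=MIN_COUNT):
    """Return the set of category values occurring fewer than min_count times."""
    counts = Counter(v for v in values if v is not None)
    return {value for value, n in counts.items() if n < min_count}


def suppress_low_frequency(values, min_count=MIN_COUNT, placeholder=None):
    """Replace every low-frequency value with a placeholder (assumed policy)."""
    rare = low_frequency_values(values, min_count)
    return [placeholder if v in rare else v for v in values]


# Hypothetical event labels: two common outcomes and one rare one.
events = ["MI"] * 40 + ["stroke"] * 35 + ["rare_event"] * 3
cleaned = suppress_low_frequency(events)
```

In this example only `rare_event` falls below the threshold, so its three records are blanked while the two common labels are left unchanged.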

The BWHHS does not share any sensitive data including occupations or information on mental health.

The following restrictions have been applied to the data released to all users.

1. Records are pseudonymised and identified by a unique BWHHS numerical ID.

2. All datasets will be stripped of specific variables that can create a risk of participant identification including:

• Date of Birth
• Date of clinical events – This includes date of recruitment (baseline)
• Any variable with a low prevalence

3. Index of multiple deprivation and super output area variables are only used by members of the BWHHS team.

4. ICD codes are hidden behind generic labels such as MI or stroke. A separate variable will specify whether cause of death is underlying, direct or both. The following describes the data received from NHS Digital and shared and an explanation of the derived variables created before sharing.

• Date of Birth – Shared as Derived Variable. By subtracting birth dates from baseline dates and dividing by 365.25, BWHHS gets the age at baseline. Age in years at baseline is shared.
• ICD-codes for direct & underlying causes of death – Shared as Derived Variable. Using the event-specific ICD10 codes, variables are created for the specified outcome classifying it as an underlying cause or direct cause. If both are true then the classification is Underlying + Direct. If neither is true the outcome is not a cause of death. The result on whether the outcome played a role in the cause of death, either as a direct (D) or as an underlying (U) cause or as both (D+U), is shared.
• Date of Death – Shared as Derived Variable. By subtracting baseline dates from each death date, BWHHS gets the number of days to death after the baseline date. Number of days from baseline to death is shared.

5. Instead of date of birth, age at baseline is provided in years, but not the specific date it was collected (actual date of birth falls in a range of 40 months).

6. Dates of clinical events are not provided; instead, time to event is provided as the number of days from or to baseline.

7. Data will be transferred through the Data Safe Haven file transfer mechanism only.
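The derived variables described in item 4 can be sketched as follows. This is an illustration under stated assumptions, not the BWHHS implementation: the function names, the single underlying code plus a list of direct codes, and the example ICD-10 codes are hypothetical. The text does not say whether age is rounded to whole years, so the sketch returns the unrounded value.

```python
from datetime import date


def age_at_baseline(date_of_birth, baseline_date):
    """Age in years: days from birth to baseline, divided by 365.25."""
    return (baseline_date - date_of_birth).days / 365.25


def days_to_death(baseline_date, date_of_death):
    """Days from baseline to death, or None if no death has been notified."""
    if date_of_death is None:
        return None
    return (date_of_death - baseline_date).days


def cause_of_death_role(outcome_codes, underlying_code, direct_codes):
    """Classify an outcome as 'U', 'D', 'D+U', or None (not a cause of death).

    outcome_codes: ICD-10 codes that define the outcome of interest.
    underlying_code / direct_codes: assumed layout of the death record.
    """
    is_underlying = underlying_code in outcome_codes
    is_direct = any(code in outcome_codes for code in direct_codes)
    if is_underlying and is_direct:
        return "D+U"
    if is_underlying:
        return "U"
    if is_direct:
        return "D"
    return None


# Illustrative record only; dates and codes are made up for the example.
age = age_at_baseline(date(1935, 1, 1), date(1999, 1, 1))
days = days_to_death(date(1999, 1, 1), date(1999, 1, 31))
role = cause_of_death_role({"I21"}, underlying_code="I21", direct_codes=["J18"])
```

Only `age`, `days` and `role` would enter the shared dataset; the dates themselves, including the baseline date, stay in the restricted table.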

This data will be stored in an Access database where all other BWHHS data is held. Only authorised researchers will have access to this data for the purpose of analysis for publications. All staff working with this data will undergo compulsory annual training in information governance run by the UCL Information Services Division.

Prior to transfer of the dataset, another BWHHS researcher will check that the dataset contains authorised variables only and that personal information has not been mistakenly included. A record of this check is made on BWHHS’s collaboration record along with the collaborator’s details, project details, the date the dataset is generated and the date the dataset is confirmed to be destroyed.
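A pre-transfer check of this kind could be partly automated. The sketch below compares the column names in an extract against the variables approved on the request form and against a list of identifiers that must never be shared. All column names and the `check_extract_columns` helper are hypothetical; the agreement describes a manual check by a second researcher and does not describe any tooling.

```python
# Hypothetical column names for identifiers that must never leave the safe haven.
FORBIDDEN_COLUMNS = {
    "name", "address", "postcode", "telephone", "nhs_number",
    "gp_details", "date_of_birth", "date_of_death", "baseline_date",
}


def check_extract_columns(columns, approved):
    """Report columns that are not approved, and columns that are forbidden."""
    present = set(columns)
    return {
        "unapproved": sorted(present - set(approved)),
        "forbidden": sorted(present & FORBIDDEN_COLUMNS),
    }


# Example: the request form approved two variables, the extract holds four.
report = check_extract_columns(
    ["bwhhs_id", "age_at_baseline", "postcode", "bmi"],
    approved={"bwhhs_id", "age_at_baseline"},
)
ready_to_send = not report["unapproved"] and not report["forbidden"]
```

An automated comparison like this would supplement the second researcher's review rather than replace it, because it cannot detect personal information hidden inside an approved free-text field.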

All organisations party to this Agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract - i.e. employees, agents and contractors of the Data Recipient who may have access to that data).


A study exploring the relationships between cognitive and sensory impairments and experiences of abuse and discrimination — DARS-NIC-164594-K4C5N

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 - s261 - 'Other dissemination of information', Health and Social Care Act 2012 – s261(2)(b)(ii)

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2019-03-01 — 2020-02-28 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Adult Psychiatric Morbidity Survey
  2. Adult Psychiatric Morbidity Survey (APMS)

Objectives:

The applicant has published previously on the 2000 and 2007 APMS datasets. The 2014 dataset is the first of this recurring survey, the first of which took place in 1993, to be held by NHS Digital. The survey is undertaken by NatCen, supported by a writing group of academics (including the applicant) from several universities including UCL who undertake secondary data analyses that have contributed a significant body of knowledge on mental wellbeing in the UK over the past 25 years. UCL are requesting access to the 2014 APMS Data Set for the purposes of a study instigated by a clinical reader at the UCL Division of Psychiatry.

This study will explore how and why cognitive and sensory impairments might be associated with worse mental health; and how health and social disadvantages might increase the risk of experiencing abusive and discriminatory experiences. The UK Department of Health defines abuse as “a violation of an individual's human and civil rights by another person(s)" (2). Abuse is defined by the impact, rather than intention of actions or inactions on an individual. Hearing and visual loss are very common, with hearing loss alone affecting a third of people aged 65 and over. Sensory impairments can present barriers to social engagement, and people who experience sensory impairments are more likely to require help from others in Activities of Daily Living. Sensory impairments might therefore be a risk factor for experiencing abuse and discrimination, although this has not been studied before.

The APMS survey data would be transferred to UCL data safe haven, where it would be analysed. The only outputs would be for publication in research journals, and no participant would be identifiable from these publications. No other organisations are involved in the data analysis.

Previous publications using the 2007 APMS dataset from UCL have found that older people are less likely to receive evidence-based treatment for common mental disorders (CMD). In the 2000 APMS data set, it was reported that the lower quality of life previously reported by people with cognitive impairment is due to the greater physical and mental health problems in this population, rather than to cognitive impairment per se. These findings have helped to build overall understanding that it is often the adverse circumstances of vulnerable groups – barriers to accessing treatment and physical morbidity – that results in their vulnerability to CMD. In the planned study UCL wish to extend this work to people with sensory impairment.

The purpose of the project is to conduct secondary analysis of the APMS 2014 data, to further understand how sensory and cognitive impairments might impact on abusive and discriminatory experiences, in order to increase scientific understanding in this area. There is a paucity of research currently on the mental health of people with sensory impairment in the UK, which is why the applicant wishes to explore the proposed work plan. The research team at UCL want to find out how having a sensory impairment can impact on mental health and social experiences, so ways to improve lives of people with sensory impairments can be explored. The researchers intend to publish findings in an open access scientific journal. The secondary analysis work was initiated by UCL in discussion with NatCen and the wider APMS writing group.

This work is not independently funded and will be carried out by the applicant and a team of researchers based at UCL. The work was instigated within the remit of the applicant’s post as clinical reader at the UCL Division of Psychiatry. No organisations other than UCL are involved in the planned data analysis; all those involved in the processing of the data are substantive employees of UCL. No elements of the work will take place outside the UK.

Yielded Benefits:

Secondary analyses of the writing group’s 2007 APMS have already been widely cited. The 2014 APMS report, published by NHS Digital, cites the contribution of UCL's previous work in this area and its influence on the 2014 survey that UCL are requesting as follows: “Analyses of APMS 2007 data indicated that white people were the ethnic group most likely to receive mental health treatment (Cooper et al. 2013) and that people of working age were more likely than older people to get appropriate treatment, especially psychological therapy (Cooper et al. 2010). APMS 2014 allowed researchers to examine whether these inequalities have persisted, and (due to the introduction of a new question in 2014) whether some groups of people are more likely to have requested mental health treatment but not received it than other groups.” A paper (listed on previous page as paper 1) that used the APMS data to explore the relationship between sensory impairment and common mental illnesses, and assessed social functioning as a mediator of this relationship as well as the 'treatment gap', has now been submitted for publication.

Expected Benefits:

There is a lack of research currently on the mental health of people with sensory impairment in the UK, which is why UCL wishes to explore the proposed work plan. Previous work carried out by this group of researchers has raised awareness of the mental health needs of older people. It is expected that this research will allow for increased scientific knowledge of how cognitive and sensory impairments might impact on the likelihood of experiencing abuse or discrimination. The benefit of exploring the mental health of people with sensory impairment in the UK is so that more information is available to a variety of clinicians and organisations to make better decisions in the provision of health care regarding those affected. With the results derived from the statistical analysis undertaken, this extra scientific knowledge can be disseminated as relevant information to a variety of people.

The employees of UCL involved in this research project are well placed to disseminate knowledge directly into mental health services through a number of channels;

1. Via professional work as clinical psychiatrists. The research can be disseminated directly to colleagues, and discussed/used in continuing professional development forums to improve the standard of health care for people with sensory impairment in the UK. If the researchers are able to understand to what extent and how people with sensory impairments, for example, are at greater risk of mental disorder, this will inform targeting/planning of future talking therapies, and adaptation to remove barriers to this population accessing mental health services.

2. The applicant is also the lead on the Alzheimer’s Society Centre of Excellence – Independence at Home which is based at University College London. Any findings that are directly related to patients affected by Alzheimer’s can be disseminated via the work here – to inform future work programmes that can address any inequalities identified in the analysis, and to help improve the standard of healthcare given to those who have sensory or cognitive impairments.

3. They are well placed and connected with a variety of Third Sector Organisations (Charities). Any findings that are directly relevant to the older generation can be disseminated to Age UK to inform future targeted work programmes again to address the inequalities discovered through the research to help better the healthcare provision for those with cognitive and sensory impairments, and to remove the barriers that this population may face when accessing mental health services.

Outputs:

UCL will disseminate findings through peer reviewed journal papers, supported by UCL media press release where findings are likely to be of widespread national interest.
The following is a list of planned papers for publication in peer reviewed journals; it is anticipated these will be published by February 2020:

1. A paper exploring the relationships between sensory impairment and mental health and possible mediators of this relationship, as well as the 'treatment gap' for people with sensory impairments relative to those without.
2. A paper exploring the relationships between cognitive impairment and mental health and possible mediators of this relationship
3. A paper exploring how sensory impairment is associated with suicidal ideation and any mediators of an association
4. A paper exploring what predicts whether people who are experiencing symptoms of common mental disorder self-diagnose or receive a professional diagnosis
5. A paper exploring sensory impairment and psychosis and possible mediators of any association.

Target journals include British Journal of Psychiatry, Lancet Psychiatry, International Psychogeriatrics. The papers are intended for the scientific community, as well as interested members of clinical professions and the public. The applicant organisation has collaborated with the Alzheimer’s Society on previous press releases.

All outputs published will be in the form of aggregated outputs with small numbers suppressed (as is in line with the HES-Analysis guide).

Processing:

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract i.e.: employees, agents and contractors of the Data Recipient who may have access to that data).

The 2014 APMS dataset is held on behalf of NHS Digital by the UK Data Service (UKDS) (www.ukdataservice.ac.uk ) and UKDS are responsible for dissemination under direction by NHS Digital. UCL will get the whole dataset; there is no facility to select individual variables. They will be able to download the dataset from UKDS for the period specified within the DSA and they must securely destroy all local copies of the dataset when the DSA expires and notify DARS in line with standard procedures. This 2014 version of the dataset available via DARS has been redacted on Disclosure Control Procedure advice to minimise the likelihood of individuals being able to identify anyone taking part in the survey.

Once an active data sharing agreement is in place, UKDS will transfer the pseudonymised APMS data to UCL. It will be transferred and accessed within the Data Safe Haven. This is UCL's data service for storing, handling and analysing identifiable data. It has been certified to the ISO27001 information security standard and conforms to NHS Digital's Information Governance Toolkit.

The data transferred from UKDS to UCL data safe haven contains survey data that is potentially identifiable. This transfer will be governed by accepted, standard procedures used by UCL Data Safe Haven identifiable data transfer portal and described on their website: http://www.ucl.ac.uk/isd/itforslms/services/handling-sens-data/tech-soln

Analysis of the data will be undertaken using statistical software, in order to understand the mental health needs of those with sensory impairment. All analyses will be performed using data weighted to take account of the complex survey design and of non-response in order to ensure that the results are representative of the British Household population. Conceptual models developed that seek to explain abuse consider: victim vulnerability, abuser stress, psychopathology or impairment, intra-individual dynamics and societal attitudes. The research team at UCL have based the hypotheses of the research project on this model, hypothesising that those who are more vulnerable due to cognitive or sensory impairments are at greater risk of worse mental health and suicidal ideation, abusive or discriminatory experiences, and that this increased risk may be due to dependence on others for care and relative isolation from protective social support.
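As a minimal illustration of why weighting matters, the sketch below computes a weighted proportion from a binary indicator and a survey weight per respondent. The function name and the toy values are hypothetical and do not come from APMS. The sketch covers the point estimate only; standard errors under a complex design need dedicated survey methods that are not shown here.

```python
def weighted_proportion(indicators, weights):
    """Share of total weight carried by respondents whose indicator is true."""
    if len(indicators) != len(weights):
        raise ValueError("indicators and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w for x, w in zip(indicators, weights) if x) / total


# Toy data: respondents reporting the outcome happen to carry smaller weights.
has_outcome = [1, 0, 1, 0]
survey_weight = [1.0, 3.0, 1.0, 3.0]

unweighted = sum(has_outcome) / len(has_outcome)
weighted = weighted_proportion(has_outcome, survey_weight)
```

Here the unweighted share is 0.5 but the weighted share is 0.25, because the respondents with the outcome represent a smaller part of the population than their count in the sample suggests.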

The data will not be stored, processed or in any other way accessible by a third party organisation. The data will be held on, and only analysed on a UCL computer within the UCL Division of Psychiatry.

Data transfers out of UCL will only be in the form of aggregated, non-identifiable data, published as research journal papers, with all small numbers suppressed in line with the HES-Analysis guide. Data will not be accessed outside the UK. All those processing the data are substantive employees of University College London. The data NHS Digital supplies will be stored within the Data Safe Haven. The data will not be linked or compared (matched) with other data sets. There will be no attempts to re-identify or re-link to identifiable record level patient data.


Mental disorders and help seeking for mental or physical health conditions in sexual minorities — DARS-NIC-159576-C0V1M

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(2)(b)(ii), Health and Social Care Act 2012 - s261 - 'Other dissemination of information'

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2019-05-02 — 2022-05-01 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Adult Psychiatric Morbidity Survey
  2. Adult Psychiatric Morbidity Survey (APMS)

Objectives:

The Division of Psychiatry at UCL’s legal basis for processing personal data under GDPR is function of a public task (by a public organisation) as set out in Article 6(1), point (e) (“necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller” “and the task or function has a clear basis in law.”) and Article 9(2), point (j) (“necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes”)

The Division of Psychiatry at UCL has a 20 year history of research into LGBT (lesbian, gay, bisexual, and transgender) mental health and well-being. The Adult Psychiatric Morbidity Survey (APMS) in 2007 was the first time that questions on sexual orientation were included in such a national survey. The chief investigator for this project from UCL collaborated with the survey makers (NatCen) on the most appropriate wording for questions about sexual orientation. Thus, UCL has a long history of research in this field and particularly in collaboration with the APMS survey.

There are many positive aspects to being gay. Evidence suggests that lesbians and gays may be more resourceful and self-reliant, have a wider circle of supportive friends and more disposable income than heterosexuals. However, there is also evidence that people who self-identify as gay or lesbian have poorer psychological health and lower social well-being than the heterosexual population in modern Britain. This is puzzling as recent social attitudes to same-sex relationships have become much more positive. The last Adult Psychiatric Morbidity Survey was the first national survey of its kind in the UK to include questions on sexual minority status. The results showed an excess of mental disorder of all types in the lesbian, gay and bisexual population (Chakraborty et al 2010). The current research seeks to discover whether this has changed and, in particular, whether mental distress may be less common in younger generations who have experienced less negative social attitudes.

There is considerable evidence that people who identify as LGBT suffer higher rates of mental disorder than their heterosexual counterparts. However, the sample populations studied are often not probabilistic, the origins of this distress are not always clear and it is not known whether higher levels of distress fall as societies become more accepting of same sex couples. UCL therefore are planning a research study with the following aims:

1. To compare rates of mental disorders and help seeking for mental or physical health conditions in non-heterosexual and heterosexual people in 2007 and 2014.
2. To model predictors of mental distress, well-being and help seeking in non-heterosexual people

The predictors identified, and the interventions suggested to modify them, will have important clinical and public health implications. This study therefore has significant potential to address health inequalities.

The study population consists of people aged less than 65 years (people aged 65+ were not asked about their sexuality in 2014) who responded to either the 2007 or 2014 Adult Psychiatric Morbidity Survey (APMS). The study design takes the form of cross-sectional studies carried out in England.

Chakraborty A T, McManus S, Bebbington P, Brugha T, Nicholson S, & King M (2010). Mental health of the non-heterosexual population of England. British Journal of Psychiatry: 198, 143-148.

Yielded Benefits:

The research has yielded no tangible benefits so far as it has taken the statistician longer than envisaged to carry out the analyses. The approach to the data and analysis must be as painstaking as possible as it is complex, and the researchers wish to make certain that the analyses and interpretation are accurate. Additionally, as this work is unfunded and funded work with more pressing deadlines has taken priority, there has been a delay in completing the work. Furthermore, the clinical trials unit (PRIMENT at UCL) in which the researchers work has been distracted with the international nature of some trials with regard to Brexit, for example supply of medications for CTIMPS (drug trials: Clinical Trial of Investigational Medicinal Products).

Expected Benefits:

Using these data, it will be possible to see whether lesbians, gays and bisexual people have excess mental distress and the factors that are associated with this. Information will also be available on positive aspects of wellbeing and resilience, and whether there have been changes since the 2007 survey. This is a key issue for the health of sexual minorities. It is also a current priority for Government, which has recently published a large survey of the health of sexual minorities. The problem with this survey, however, is that it is a much less representative sample of LGBT people than those in the APMS study. Thus, this analysis will be an important contribution to the current state of knowledge. UCL plan to disseminate it at the International Meeting of the Royal College of Psychiatrists International Congress, as well as at the annual meeting of the World Association of Social Psychiatry. Study findings will be published in leading journals such as Lancet Psychiatry, the British Journal of Psychiatry, and World Psychiatry. The clinical experience within the team will maximise the chances of our findings being translated into clinical and public health recommendations regarding interventional work. Our links with the PRIMENT Clinical Trials Unit will improve the chances of successful applications for trial funding to evaluate such interventions. Members of our research team have participated in a Government Equalities Office (GEO) to advise on effective interventions to improve LGBT mental health. These policy links will enhance the application of our research findings on effective interventions to policy developments.
This APMS work therefore represents an important foundation for future work to benefit the health of LGB people, in line with the government’s LGBT Action Plan 2018.

UCL will take action to improve mental healthcare for LGBT people. The Department of Health and Social Care and the Government Equalities Office will jointly develop a plan focused on reducing suicides amongst the LGBT population. The Department of Health and Social Care will ensure LGBT people’s needs are addressed in the updated Suicide Prevention Strategy, and the new Health Education England suicide prevention competency framework will cover high-risk groups including LGBT people.

There have been large positive changes in the UK and other western societies in terms of acceptance of LGBT rights, including equal marriage and the Equality Act. Thus, society should have begun to see changes in their mental health. However, UCL's recent analysis of data from an English birth cohort of non-heterosexual people aged 20 (the Avon Longitudinal Study of Parents and Children, ALSPAC) indicates that levels of psychological distress remain markedly elevated over that of their heterosexual peers (Irish et al, 2019). This is puzzling and needs replication in a national sample that is more representative of England. If UCL find a similar picture in the APMS data, available co-variates will need to be explored in more detail to understand why that might be. This will inform more targeted prevention, as well as support and treatment for those affected in the general population.

The 2007 APMS was the first to ask a question about sexual orientation and so the results of UCL's analysis (which combines the 2007 and 2014 surveys) of mental health and service use in LGBT people will be crucial in determining how best services, schools and the wider community can act together to improve the health of this sexual minority. This may involve health and well-being campaigns that are directed at the whole community and not just LGBT people, who unfortunately are still too often the targets of exclusion and discrimination.

Outputs:

All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.

The results of this research will be published in high-impact peer-reviewed journals and presented at conferences (for example, the Royal College of Psychiatrists Annual International Congress 2019; the Health Studies User Conference 2019). In particular, results will be published in journals concerned with mental health and medical practice. The peer-reviewed articles will be available on the UCL repository giving free access to either the published article or the accepted manuscript depending on the journal. The aim is to have the work published by the end of 2019.

To reach a broader audience UCL will also use Twitter, UCL blogs, Mental Elf blogs, and media such as “The Conversation” to disseminate the key messages of our research, and their clinical implications.

UCL plan to disseminate and recommend action in late 2018 or early 2019. Organisations involved in implementation include the Royal College of Psychiatrists, the Mental Health Foundation and Stonewall. The principal investigator (PI) for this project is on the Executive of the Royal College of Psychiatrists’ Special Interest Group in LGB mental health and thus has a direct line to their policy units.

Processing:

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract i.e.: employees, agents and contractors of the Data Recipient who may have access to that data).

The 2014 APMS data set is held on behalf of NHS Digital by the UK Data Service (UKDS) (www.ukdataservice.ac.uk) and UKDS are responsible for dissemination under direction by NHS Digital. UCL will get the whole data set; there is no facility to select individual variables. UCL will be able to download the data set from UKDS for the period specified within the DSA and must securely destroy all local copies of the data set when the DSA expires and notify NHS Digital in line with standard procedures. The available 2014 version of the data set has been redacted on Disclosure Control Procedure advice to minimise the likelihood of individuals being able to identify anyone taking part in the survey.

Once an active data sharing agreement is in place, UKDS will transfer the pseudonymised APMS data to UCL. It will be transferred and accessed within the Data Safe Haven. This is UCL's data service for storing, handling and analysing identifiable data. It has been certified to the ISO27001 information security standard and conforms to NHS Digital's Information Governance Toolkit.

The data transferred from UKDS to the UCL Data Safe Haven contains survey data that is potentially identifiable. This transfer will be governed by the accepted, standard procedures used by the UCL Data Safe Haven identifiable data transfer portal and described on their website: http://www.ucl.ac.uk/isd/itforslms/services/handling-sens-data/tech-soln. Data will also be stored in the UCL Data Safe Haven, which is IG Toolkit assured and dual-factor authenticated: access is determined on a need-to-know basis, the firewall has a default deny policy, data enters via a managed file transfer mechanism, and only the information asset owner has permission by default to draw down any data. All those on the study team who will analyse data are required to complete Data Security Training, as provided by Health Education England, and to complete the authorisation process. Data will be destroyed in time for the termination of the Data Sharing Agreement with NHS Digital so that only the outputs remain. A notice of destruction will be issued in writing to confirm, 90 days after the user has clicked ‘delete’, once the backup cycle has overwritten the files and the data have not been restored in the intervening time.

All analyses will make use of the weightings provided with the data sets to give results that are applicable to the population and account for the primary sampling unit. New weightings for the 2007 survey, provided in 2016, will be used. Descriptive statistics will also be carried out unweighted (representing those in the data sets only). All weighted analyses will be carried out using Stata version 14 survey (svy) commands.
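The difference between a weighted and an unweighted descriptive statistic can be sketched as follows. This is a simplified Python illustration, not the Stata svy procedure named above: it computes point estimates only and does not attempt the variance estimation that accounts for the primary sampling unit. The function names and the example values are assumptions made for the sketch.

```python
from typing import Sequence


def unweighted_prevalence(outcomes: Sequence[int]) -> float:
    """Proportion with the outcome among respondents in the data set only."""
    return sum(outcomes) / len(outcomes)


def weighted_prevalence(outcomes: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted proportion: each respondent counts in proportion to their
    survey weight, so the estimate applies to the target population."""
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total_weight = sum(weights)
    return sum(y * w for y, w in zip(outcomes, weights)) / total_weight


# Made-up values: 1 = outcome present, 0 = absent.
outcomes = [1, 0, 0, 1]
weights = [2.0, 1.0, 1.0, 0.5]

crude = unweighted_prevalence(outcomes)
adjusted = weighted_prevalence(outcomes, weights)
```

With these made-up values the unweighted proportion is 2/4 and the weighted proportion is 2.5/4.5, which shows how respondents with larger weights pull the population estimate.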

Descriptive analysis
Descriptive analyses will be undertaken for each survey (2007 and 2014) as well as in the whole data set combined. However for some of the variables below, it will not be possible to examine these by year of survey because questions were changed somewhat between surveys. If any particular outcome from those noted above is not available for either year, it will only be reported in the analysis for the year it is available.

Variables to be included:
Social, demographic and personal: such as age, sex, children, and social support.

Mental health outcomes:
The main outcome of interest is what is called common mental disorder. This is a variable which combines all the diagnoses which come under the general rubric of depression and anxiety. Other factors such as severe mental illness, self-harm and drug use will also be examined. However UCL do not wish to lose sight of positive factors such as emotional well-being and will also present group comparisons on these measures.

Help-seeking and use of services:
The 2014 APMS data set is rich in information on reported help seeking and so UCL shall explore to what extent people have received counselling, their contacts with primary medical care and social care services, and whether they have been admitted to hospital.

Other variables:
UCL will also explore differences between heterosexual and non-heterosexual people on factors such as spiritual beliefs and religious practice.

The differences between the data contained in the 2007 and 2014 APMS will enable UCL to examine change with time in terms of what is called ‘period effects’, to determine whether there are differences for people of similar age and background between the two time points.

In more sophisticated analyses UCL will explore the possible reasons for hypothesised elevated rates of psychological distress in LGB people, for example discrimination, childhood abuse and neglect, parenting, interpersonal violence, and bullying. UCL will examine the effect of age, sex and general health on the associations observed and where possible examine men and women separately. In some instances this may be limited by numbers. If numbers become very small in sub-categories, UCL shall amalgamate the two groups into the more general comparison of heterosexual versus non-heterosexual.

There will be no data linkage undertaken with NHS Digital data provided under this agreement that is not already noted in the agreement.

Data will only be accessed and processed by substantive employees of University College London and will not be accessed or processed by any other third parties not mentioned in this agreement.

This research will only report aggregate data showing patterns overall; meaning no individual will be able to be identified.


MR1129 - SCORAD Feasibility Study — DARS-NIC-156409-W056Z

Type of data: information not disclosed for TRE projects

Opt outs honoured: Identifiable

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by NHS Digital

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2011-10-19 — 2026-10-18 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. MRIS - Cause of Death Report
  2. MRIS - Cohort Event Notification Report
  3. MRIS - Flagging Current Status Report
  4. MRIS - Members and Postings Report
  5. MRIS - Personal Demographics Service
  6. MRIS - Scottish NHS / Registration

Objectives:

SCORAD III: A randomised phase III trial of single fraction radiotherapy compared to multifraction radiotherapy in patients with metastatic spinal cord compression

This trial aims to determine whether patients with spinal cord compression can maintain or regain the ability to move and walk as well after one dose of radiotherapy as after five doses.

It will also examine whether they have similar quality of life and tolerance to treatment regardless of which treatment they receive. It will evaluate single fraction radiotherapy against multifraction radiotherapy in terms of ambulatory status, function, quality of life and toxicity to 3 months and survival to 12 months. This trial is attractive to radiotherapists because either treatment is simple to deliver. Furthermore, the single fraction regimen requires fewer NHS resources, results in both a reduced hospital stay and less movement for patients with spine damage and shortened waiting lists as radiotherapy slots are freed.


MR688 - UK Ductal Carcinoma in Situ (DCIS) Trial — DARS-NIC-148348-XTNPJ

Type of data: information not disclosed for TRE projects

Opt outs honoured: Identifiable

Legal basis: Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistics and Registration Service Act 2007; National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2010-09-10 — 2020-09-09 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. MRIS - Cause of Death Report
  2. MRIS - Cohort Event Notification Report
  3. MRIS - Flagging Current Status Report
  4. MRIS - Members and Postings Report
  5. MRIS - Personal Demographics Service
  6. MRIS - Scottish NHS / Registration

Objectives:

UK Ductal Carcinoma in situ (DCIS) Trial

To perform a randomised 2 x 2 trial to determine, in screen detected DCIS, the effect on the incidence of subsequent invasive breast cancer of complete excision (WLE) alone compared to that of WLE followed by radiotherapy to the residual breast tissue and/or tamoxifen 20mg daily for five years. The incidence of subsequent DCIS and the cause of death were also monitored, as was the continued use of hormone replacement therapy or oral contraceptives.


Extended follow-up of the TARGIT A Trial — DARS-NIC-126676-G1X4M

Type of data: information not disclosed for TRE projects

Opt outs honoured: Identifiable (Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 – s261(2)(c)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2019-04-01 — 2022-03-30 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Civil Registration (Deaths) - Secondary Care Cut
  2. Civil Registrations of Death - Secondary Care Cut

Objectives:

Breast cancer remains the most common female malignancy and its incidence continues to rise. The common conventional treatment of early breast cancer involves surgical excision of the tumour and surgery to the axillary lymph nodes. This breast conserving surgery needs to be followed by external beam radiotherapy given over several weeks of daily treatments, given with the intention of reducing the rate of further cancer developing within the operated breast. Whilst this is an effective treatment with a low rate of local recurrence of cancer, laboratory work and its clinical correlation has suggested that radiation to the whole breast may not be necessary in all cases and radiotherapy to the tissue only around the tumour within a risk-adapted approach may be as effective.

University College London (UCL) has been awarded a grant from NIHR Health Technology Assessment (HTA) to run the study Extended follow up of the TARGIT-A trial through the Surgical & Interventional Trials Unit (SITU). The SITU is part of UCL. The SITU specialises in providing infrastructure for running studies and has a history of managing large-scale randomised controlled clinical trials in solid tumours, historically in breast cancer. UCL will be the only organisation with access to the record-level data requested from and supplied by NHS Digital.

The TARGIT-A randomised clinical trial compared a risk-adapted approach with use of single dose targeted intra-operative radiotherapy (TARGIT-IORT) vs. conventional external beam radiotherapy (EBRT) given as a daily course over 3 to 6 weeks. The initial and 5-year results have been published and found that TARGIT-IORT is non-inferior to EBRT.

The published results of the TARGIT-A trial show that compared with conventional external beam radiotherapy given over several weeks, TARGIT given at the time of lumpectomy within a risk-adapted approach achieves much the same results in terms of breast cancer control (locally and systemically). Interestingly, TARGIT was found to have a significantly lower mortality from causes other than breast cancer due to fewer deaths from cardiovascular causes and other cancers.

Although the current results are convincing enough for the treatment to be adopted worldwide (over 20,000 women have now had this treatment worldwide), it is essential that all the UK cohort of 608 patients who were randomised into the trial are followed up over a longer period of time and data analysed as per the original TARGIT-A trial protocol. For patients in the UK cohort, their data will come from NHS Digital. The current plan is to analyse by pre-pathology and post-pathology strata, with subgroup analysis by hormone receptor status and hormone therapy. Multivariate analysis will also be performed to assess the predictive value of other tumour and patient factors such as age, tumour size, grade, lymph node status, margins, lymphovascular invasion and time since randomisation. Recruitment to the trial was completed in June 2012.

This extended follow-up study will enable timely recording of additional local recurrences and deaths. With a higher number of events, it would be possible to perform meaningful subgroup analysis using predictive factors such as hormone receptors (available data suggests that these have a predictive value), tumour grade and lymph node involvement that would allow fine tuning of patient selection criteria. Furthermore the effect on non-breast-cancer and overall mortality will also be ascertained.

It is expected that the new data (including that from NHS Digital) will significantly influence wider and enthusiastic adoption of this approach that will be greatly welcomed by patients. As a large proportion of such patients are screen-detected, their overtreatment would be avoided by such adoption.

Expected Benefits:

The results from the TARGIT-A trial show that both conventional and novel means of administering radiotherapy produce similar results. SITU would like to continue to collect data about the health status of all patients in the trial to enable the researcher to learn about long-term differences in the effects of these treatments on health.

This extended follow-up study will enable timely recording of additional deaths. Furthermore, the effect on non-breast-cancer and overall mortality will also be ascertained. This study will therefore be expected to produce measurable benefits to the health of NHS patients within the UK, as patients could receive all of their radiotherapy treatment whilst in the operating theatre rather than having to return daily for several weeks for radiotherapy treatment. For the patients, the biggest benefit of having TARGIT-IORT during their lumpectomy procedure, under the same anaesthetic, is that they complete their local treatment in one session and with lower toxicity.

It is anticipated that the results from this study will add to the researchers' overall understanding of breast cancer and how it may be better treated in future. In addition, a successful outcome will mean that the methods for obtaining follow-up information used in this study could be applied to future clinical trials where long-term follow-up of patients is important. Early breast cancer has a very good survival rate, and the vast majority of women will have no problems after their initial treatments (surgery and radiotherapy). Therefore, in the UK these women tend to be discharged from hospital clinical care after three years. However, in the trial we want to obtain follow-up data for at least ten years, so obtaining the information from hospitals is becoming increasingly difficult. Directly contacting patients seems to be a way to obtain the required data in a more straightforward manner, and “fills the gap” between hospital follow-up and ONS data.

For any healthcare system including the NHS, TARGIT-IORT has been shown to be cost effective and incurs a lower overall cost to the NHS. It also reduces the journey times for patients who would otherwise need to travel, on average, 730 miles for their EBRT treatment.

Outputs:

TARGIT-A data has been published several times at different junctures through the trial follow up. This request to obtain Civil Registration data from NHS Digital will enable SITU to incorporate death data in order to update the dataset used in these publications.

SITU will also aim to publish the results in high impact international journals, such as The Lancet. Patient-level data from NHS Digital will not be published.

In addition, reports of interim results will be provided to the Extended follow-up of TARGIT-A Trial Steering Committee, Sponsor and Funder.

The final report of results is anticipated to be submitted to the funder in 2023. This publication is expected to be open access in a peer reviewed journal such as The Lancet, British Medical Journal, etc.

Further academic papers will be published in open-access, high impact, peer reviewed journals on the methodology and impact on mortality. A patient-friendly version of the findings will be published on the UCL website.

For each paper published, a short presentation will be developed to summarise the findings for a range of stakeholders who have an active interest in the TARGIT method of administering radiotherapy, including health care professionals, patient groups, and/or their carers. Findings will be presented at national (NCRI Cancer Conference event, Association of Breast Surgeons - ABS meeting) and international events (ASCO annual meeting). It is important for patients, their family, and friends, that they are reassured the TARGIT technique has established long term safety and efficacy.

Attendance is planned at national conferences such as The National Cancer Research Institute (NCRI) annual meeting and international conferences such as The American Society of Clinical Oncology (ASCO).

The overall mortality and onset of new cancers are two critical outcomes for this study.

All outputs and publications contain only aggregated data with small numbers suppressed in line with the HES Analysis Guide.

Processing:

The study has been divided into two Work Packages.

Work Package 1: For patients in the UK cohort (England only), continue to gather efficacy, safety and follow-up data to year 10 by contacting patients directly and asking them to consent to have their information collected through Work Package 2 (below), and complete an annual questionnaire.

Work Package 2: Collect death data for UK patients through NHS Digital.

Collection of death data from UK patients through NHS Digital will help improve the completeness of the data, and will support WP1.

Identifiers for the cohort who have consented to be in the study will be sent to NHS Digital. Identifiers include:
~ NHS Number
~ Date of birth
~ Postcode
~ Unique Study ID

In addition, as part of Work Package 1, the patient will be contacted annually directly by SITU and asked to complete a follow-up (questionnaire) form, and return by post in the pre-paid envelope provided. Patients will be followed up until death or withdrawal of consent. Patients can refuse consent at any time by contacting either SITU or the PI, in which case no further contact will be made.

NHS Digital will then return linked Civil Registration data (death data linked to the study identifier) to the SITU unit at UCL.

This data file will then be linked to the existing TARGIT A Trial database (using the Unique Study ID). UCL will extract a subset of the data containing no direct patient identifiers and periodically send this to the Trial Statistician for analysis via a secure encrypted electronic process carried out at UCL. Each patient will only be identified by a unique subject number (e.g. TT 018 001 X).
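The linkage and de-identification step described above could look roughly like the following Python sketch. The field names, the helper function and the example records are all hypothetical and are not taken from the TARGIT A Trial database; the sketch only illustrates joining on the Unique Study ID and dropping direct identifiers before records go to analysis.

```python
from typing import Dict, List, Optional

# Hypothetical names for the direct identifiers to strip before analysis.
DIRECT_IDENTIFIERS = {"nhs_number", "date_of_birth", "postcode"}


def link_and_strip(
    trial_rows: List[Dict[str, Optional[str]]],
    death_rows: List[Dict[str, Optional[str]]],
) -> List[Dict[str, Optional[str]]]:
    """Attach the date of death (if any) to each trial record by study ID,
    then remove direct identifiers from the result."""
    deaths_by_id = {row["study_id"]: row for row in death_rows}
    linked = []
    for row in trial_rows:
        merged = dict(row)
        death = deaths_by_id.get(row["study_id"], {})
        merged["date_of_death"] = death.get("date_of_death")
        linked.append(
            {key: value for key, value in merged.items() if key not in DIRECT_IDENTIFIERS}
        )
    return linked


# Made-up records for illustration only.
trial_rows = [
    {"study_id": "TT 000 001 X", "nhs_number": "0000000000", "postcode": "ZZ1 1ZZ", "arm": "A"},
    {"study_id": "TT 000 002 X", "nhs_number": "0000000001", "postcode": "ZZ2 2ZZ", "arm": "B"},
]
death_rows = [{"study_id": "TT 000 002 X", "date_of_death": "2020-01-01"}]

analysis_rows = link_and_strip(trial_rows, death_rows)
```

Keying the lookup on the study ID alone means the returned death data never needs to carry NHS Number, date of birth or postcode into the analysis extract.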

SITU do not require direct patient identifiers from NHS Digital. SITU already hold identifiable data on current TARGIT A Trial patients. Identifiable data already held is only accessed by authorised individuals within UCL, all of whom are substantive employees.

No attempts will be made to re-identify individuals from the data and the identifiable data will not be made available to any third parties.

In order to comply with the requirements of the grant, the patients will be followed up for 5 years in the first instance, as most patients already have 5 years of follow-up and data needs to be collected until they have been in the study for 10 years. NHS Digital civil registration data is only required from patients who were recruited into the TARGIT A trial from 6 UK based hospitals (London UCL, London Royal Free, London Whittington, London Guy's and Winchester Royal Hampshire).

All data obtained will be held securely in UCL. Patient identifiers (such as name, address, etc.) will be held on a separate Data Safe Haven which is a service that provides a technical solution for storing, handling and analysing identifiable data provided within UCL.

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract i.e.: employees, agents and contractors of the Data Recipient who may have access to that data).

Data will only be requested from NHS Digital where patients have provided a complete and valid signed consent form. For avoidance of doubt, the death data will only be requested and will only flow after the consent has taken place.

This application relates solely to the follow-up phase of the study.


The role of IAPT in the prevention of dementia and the amelioration of its impact on service use and co-morbidities (the MODIFY project) — DARS-NIC-157211-T8B2M

Type of data: information not disclosed for TRE projects

Opt outs honoured: No - data flow is not identifiable, Anonymised - ICO Code Compliant, No (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(2)(b)(ii), Health and Social Care Act 2012 – s261(2)(a)

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2020-01-20 — 2023-01-19 2020.06 — 2024.05. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Bridge file: Hospital Episode Statistics to Mental Health Minimum Data Set
  2. Mental Health Services Data Set
  3. Mental Health and Learning Disabilities Data Set
  4. Mental Health Minimum Data Set
  5. Civil Registration - Deaths
  6. HES:Civil Registration (Deaths) bridge
  7. Hospital Episode Statistics Admitted Patient Care
  8. Hospital Episode Statistics Accident and Emergency
  9. Hospital Episode Statistics Outpatients
  10. Improving Access to Psychological Therapies Data Set
  11. Emergency Care Data Set (ECDS)
  12. Civil Registration (Deaths) - Secondary Care Cut
  13. HES-ID to MPS-ID HES Accident and Emergency
  14. HES-ID to MPS-ID HES Admitted Patient Care
  15. HES-ID to MPS-ID HES Outpatients
  16. Improving Access to Psychological Therapies Data Set_v1.5
  17. Civil Registrations of Death - Secondary Care Cut
  18. Hospital Episode Statistics Accident and Emergency (HES A and E)
  19. Hospital Episode Statistics Admitted Patient Care (HES APC)
  20. Hospital Episode Statistics Outpatients (HES OP)
  21. Improving Access to Psychological Therapies (IAPT) v1.5
  22. Mental Health and Learning Disabilities Data Set (MHLDDS)
  23. Mental Health Minimum Data Set (MHMDS)
  24. Mental Health Services Data Set (MHSDS)
  25. Improving Access to Psychological Therapies (IAPT) v2

Objectives:

The MODIFY project, otherwise known as “Mental health and other psychological therapy Outcomes; their relationship to Dementia Incidence in the Following Years”, is funded by the Alzheimer's Society and led by University College London researchers. MODIFY aims to enhance understanding of dementia prevention in the UK by examining the role of psychological therapies offered within the England-wide Improving Access to Psychological Therapies (IAPT) services in dementia prevention.

35% of dementia cases are thought to be attributable to modifiable risk factors. Many of these dementia risk factors such as anxiety, depression, social isolation or alcohol use may be modifiable through psychological therapy. Despite this, no one has yet tested whether psychological therapies are associated with reduced future risk of dementia. Consequently, the overarching purpose of this request is to understand whether and how IAPT psychological therapies might play a role in preventing dementia and help those already living with the condition as well as elucidating which factors might affect their utility in doing so.

The researchers plan to create a data resource that links records from the NHS ‘Improving Access to Psychological Therapies’ (IAPT) services to electronic medical records of dementia diagnosis. Using this resource, the researchers will be able to find out whether receiving successful treatment for anxiety and depression is associated with reduced risk of developing dementia. The Applicant has confirmed that this resource and the data within it will not be used for any studies other than as described in this data sharing agreement.

The researchers will also look into the possibility of measuring change due to other risk factors not treated in IAPT interventions, such as poor sleep, loneliness, isolation, physical inactivity and high alcohol use. If this is possible, it could lead to further research into whether psychological therapy may help to prevent dementia by changing other risk factors of dementia.

This request for NHS Digital data comprises one component of the MODIFY project which will commence in November 2019. The MODIFY project also includes a feasibility study examining the impact of psychological therapies on dementia relevant outcomes that are not recorded in the national IAPT dataset. This second component will start approximately in March 2020 and will involve prospectively collecting data from several London based IAPT services. This will be subject to a separate IRAS application and it will not involve any data from NHS Digital.

Young-onset dementia is typically defined as emerging from age 30 onwards, therefore the MODIFY project would like to capture this often ignored group of people with early onset dementia in their data analysis.

DATA REQUESTED
The study requires data for adults aged 20 years and upwards who were referred to IAPT, from the start of the study (2012) up to study end in 2022, together with their linked Hospital Episode Statistics (HES), Mental Health records and Mortality records as described in the data products section of the application. This comprises annual pseudonymised data for the IAPT data set and linked records in MHMDS (01 April 2013 – 31 August 2014), MHLDDS (01 Sept 2014 – 30 Nov 2015) or MHSDS (1 Jan 2016 to present; v4 from 1 April 2019), Mortality data, and the HES APC, OP and AE, and ECDS data sets covering the periods 2012-13 to 2021-22.

The proposed dissemination will create a unique longitudinal linked data set, the analysis of which will provide new information on how psychological therapies in general, and IAPT services in particular, might help prevent dementia. It will, for the first time, enable quantification of:
i) the potential longitudinal impact of IAPT interventions on dementia risk and dementia risk factors,
ii) how that impact might be achieved,
iii) what elements of IAPT provision might maximise that impact and
iv) inequalities in access to IAPT as a potential dementia prevention resource.

The processing of the data is being carried out under article 6(1)e and 9(2)j of the 2018 Data Protection Act. The proposal meets the requirements of article 6(1)e (public interest), because the outputs will provide the first data on the potential utility of an England wide NHS service (IAPT psychological therapies) in addressing a key public health issue - the prevention of dementia and reduction of chronic disease burden.

The proposal meets the requirements of article 9(2)j as it relates to scientific research (described in more detail below) conducted in order to provide the public interest outcome listed above. It is of note, that the scientific research conducted based on this dissemination will also substantially enhance knowledge about dementia prevention globally, since this is the first study to directly examine how psychological therapies for anxiety and depression might prevent dementia.

Expected Benefits:

Findings from this work disseminated and communicated as above have the potential to benefit the future provision of healthcare by providing the first data on an important potential benefit (dementia prevention and amelioration) of psychological therapies already offered throughout England via IAPT.

There are between 850,000 and 1,000,000 people with dementia in the UK and a recent Lancet commission report estimated that 35% of dementia risk is explained by modifiable risk factors. Psychological therapies have the potential to ameliorate several of the risks identified by the Lancet commission (depression, social isolation) and other known risks too (anxiety). While precise estimates of reduction in risk are not possible to make, if depression alone were eliminated then it is estimated that dementia prevalence could be reduced by 34,000 at 2015 rates. Given the devastation and healthcare utilisation costs that dementia brings, this reduction would be of significant import to population wellbeing and healthcare budgets. Through quantifying and understanding the potential impact of IAPT on dementia risk reduction, appropriate information can be provided to the 500,000 IAPT attendees yearly.
It may also be possible to use results to tailor psychological therapy provision for the tens of thousands of older people who use IAPT services yearly towards dementia reduction (e.g. it may be that some types of psychological therapies offered in IAPT are better at preventing dementia than others and these could be promoted).

A dementia prevention aspect to psychological therapies would also support the current push to increase the availability of psychological therapies for older adults (who numerous reports suggest are under-represented in IAPT). Indeed, since recent studies have found that dementia is a key fear among older adults, having this knowledge might improve access and uptake in and of itself. Furthermore, IAPT psychological therapies are already offered throughout England and, thus, unlike other potential dementia prevention tools, the proposed prevention activity is already in place and has a delivery system England wide.

This work will also generate important research findings and new questions, the dissemination and communication of which will stimulate researchers and clinicians internationally to develop new research projects and clinical practices aimed at dementia prevention.

Finally, the examination of whether there might be inequalities in access to IAPT services, which might be relevant to dementia prevention, could provide the basis for campaigns to reduce inequalities in IAPT access on dementia prevention grounds. As a consequence of all the above, the proposed dissemination is in the public interest.

The benefits above will be achieved following the end of the three-year project as a result of the dissemination and communication of the outputs as discussed in the outputs section to key figures in:
i) Healthcare policy (the founder of IAPT, the director of the UCL Centre for Outcome Research, presentations at the Alzheimer’s Society national research and policy conference)
ii) Clinical practice (Trust leads for IAPT in London based NHS trusts, integration into training of 150 clinical psychologists and 200 IAPT trainees at UCL, presentations and workshops to clinical audiences)
iii) IAPT service user and dementia movements (Grant co-investigators and collaborators include 5 members of the Alzheimer’s Society volunteer research network, all of whom are affected by dementia, as well as IAPT service users who are active in promoting IAPT service user interests)
iv) Dementia research

Outputs:

The outputs will include annual progress reports to the Alzheimer's Society: the first is due in July 2020, the next in 2021, and the final one in July 2022.

It is expected that several empirical papers will be submitted to key relevant peer reviewed journals such as the Lancet Psychiatry, the British Journal of Psychiatry, Alzheimer’s and Dementia and the International Journal of Geriatric Psychiatry. The first of these is expected to be submitted around end of year 1 of the grant (July 2020) with other publications submitted in the three years following that.

There will be conference presentations including presentations at the annual Alzheimer’s Society conference (May/June or July 2020, 2021, 2022) and a planned presentation at the world’s largest international dementia conference, the Alzheimer’s Association International Conference (AAIC), in July 2021 and 2022.

OUTPUTS
All data in all outputs will be aggregated with small numbers suppressed as per the HES Analysis Guidance and the IAPT, ECDS and the Mental Health (MHSDS, MHLDDS, MHMDS) data sets Disclosure Policies.
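
As an illustration of small-number suppression in aggregated outputs, the following is a minimal sketch only. The threshold value, the `*` marker and the choice to leave zero counts visible are assumptions made for the example; the agreement defers to the HES Analysis Guidance and the dataset disclosure policies, which define the actual rules.

```python
from typing import Dict, Union

# Hypothetical threshold: this section does not state a value, so 8 is an
# assumption for illustration, not the figure from the HES Analysis Guidance.
SUPPRESSION_THRESHOLD = 8


def suppress_small_numbers(
    counts: Dict[str, int],
    threshold: int = SUPPRESSION_THRESHOLD,
    marker: str = "*",
) -> Dict[str, Union[int, str]]:
    """Replace non-zero counts below the threshold with a marker.

    Zero counts are left as zero in this sketch (an assumption).
    """
    return {
        group: (marker if 0 < n < threshold else n)
        for group, n in counts.items()
    }


if __name__ == "__main__":
    aggregated = {"65-74": 120, "75-84": 3, "85+": 0}
    print(suppress_small_numbers(aggregated))
```

Suppression is applied to the aggregated table, after counting, so record-level values never appear in the output.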

There will be an active process of dissemination of the project to relevant stakeholders as the project goes on. This will be done through ongoing engagement with the project stakeholder reference group (which includes people affected by dementia, IAPT service users and IAPT clinicians). There will be conference presentations, detailed above, to audiences that include clinicians, people affected by dementia, depression and anxiety, researchers and policy makers. Summaries of outputs will be published in patient-facing journals. The Alzheimer’s Society will publish results in their ‘care and cure’ magazine.

The project also has links with AnxietyUK who would publish results in their newsletter. There will also be direct bilateral engagement with key figures in psychological therapy research and practice. For instance, the founder of IAPT services supported the funding application. Key publications, findings and conference presentations will be disseminated through UCL media channels (including collaborator and departmental twitter accounts). Workshops based on findings will be run with the 150 clinical psychology trainees on the UCL clinical psychology doctorate of which the applicant is Clinical Director. Findings will also be integrated into the IAPT training courses at UCL, over which the applicant, who is involved in the project and a substantive employee of UCL, has oversight. There are direct links to NICE and consequent policy influence through the director of the UCL Centre for Outcomes Research and Effectiveness who is involved with this project and is a substantive employee of UCL. IAPT service leads and senior practitioners in London IAPT services are collaborators on MODIFY and will take findings to senior trust meetings. These contacts in research, policy, clinical practice, dementia and IAPT service user movements will also facilitate the development of new relationships which will be used to further disseminate findings.

There will be active communication to ensure findings reach their intended audience. This will be through many of the dissemination activities and channels listed above but also through a project website (still in production) and promotion of research at conferences with large degrees of lay attendance (e.g. the Alzheimer’s Society Conference). The UCL and Alzheimer’s Society press offices will be contacted at the point of publication of papers or conference presentations to discuss potential dissemination in the wider media. Throughout, there will be consultation with the stakeholder steering group and other partners above as to how best to communicate findings.

There will not be other exploitation of results or outputs. Only substantive employees of UCL, or students on UCL MSc and doctorate courses under the supervision of UCL substantive employees will have access to the data. All outputs outside of this will be aggregated with small numbers suppressed.

Processing:

NHS Digital will link the datasets listed in products (all of which are held by NHS Digital) using a non-identifiable linking key. Aside from these internal NHS Digital linkages there will be no other linkages of this data. The entire linked dataset will be pseudonymised and no identifiable variables have been included in the dataset.

The data flow out of NHS Digital would be the data sets described in products and above, which would include special categories of pseudonymised health data going to the data safe haven at UCL. There will be no subsequent data flows out of UCL. There will be no flow of data into NHS Digital.

The data is not being matched to publicly available data. Re-identification of individuals is not permitted under this agreement.

The data will be held on the UCL data safe haven using UCL approved computers. The Data Safe Haven is UCL's technical solution for transferring and storing research information that is highly confidential. It meets the requirements of the NHS Digital Information Governance Toolkit and ISO 27001 Information Security standard. Access is controlled by the ‘Information Asset Owner’ and they complete training in confidentiality and data protection, which is renewed annually.

System access is restricted to UCL-approved computers. To log in, the user must enter a username, a randomly generated number from a device held by the user (combined with a PIN), and a regularly updated password of specified length and complexity.

No organisations other than UCL are involved in the planned data analysis. All those involved in the processing of the data are substantive employees of UCL, or students on UCL MSc and doctorate courses under the supervision of UCL substantive employees, and must sign and adhere to an Honorary Contract, to which extra wording covering the responsibility for students has been added. The work undertaken by the students is only for the purpose stated in this Purpose section.

No elements of the work will take place outside the UK.

All UCL students are expected to undertake annual training on handling highly confidential information. All trainees and students register for and complete NHS Digital’s Data Security Awareness (NHSD) course provided by e-Learning for Health. The course covers data security awareness, the law, threats to data security, breaches and incidents, and the General Data Protection Regulation. Completion of the course is sufficient evidence of basic information governance training to handle highly confidential information under the School of Life and Medical Sciences (SLMS) Information Governance Training Policy. Trainees' and students' up-to-date training is recorded in the SLMS IG training register.

All students working on the MODIFY project are students from UCL. All students working on the MODIFY project will undertake NHS Digital’s Data Security Awareness course provided by e-Learning for Health. UCL has a specific data protection and information security policy, which applies to all staff and students when processing personal data on behalf of UCL. All UCL students working on the MODIFY project are bound by this policy and will face potential sanctions in the event of a breach of the policy.

All students sign up to UCL's Academic Manual. The Student Academic Misconduct section of the 2019-2020 manual (Section 9.1, item 3) states "All instances of Research Misconduct whether by taught students, research students or members of staff will be investigated under UCL’s Procedure for Investigating and Resolving Allegations of Misconduct in Academic Research".

DATA MINIMISATION
• The study requires data for adults aged 20 and above from 2012 (study baseline). There is no upper age limit. Data is required only where an individual also has a record in IAPT data.
• Only periods necessary for analysis have been requested (from the start of IAPT data 2012 to end of study 2022).
• Identifiable data is not required.
• Only fields necessary for analyses of the prevention of dementia or amelioration of its co-morbidities or impacts on health and service use (e.g. the entire maternity category is omitted) have been included.
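
The minimisation rules above could be applied to an extract along the following lines. This is a sketch under stated assumptions: the record layout, the field names (`person_id`, `age_at_baseline`, `event_date`), the `maternity_` prefix used to mark omitted fields and the exact start and end days are all hypothetical and are not taken from the agreement.

```python
from datetime import date
from typing import Dict, Iterable, List, Set

# Assumed boundaries: the text gives only the years 2012 and 2022.
STUDY_START = date(2012, 1, 1)
STUDY_END = date(2022, 12, 31)
MIN_AGE_AT_BASELINE = 20

# Hypothetical naming convention for an omitted field category.
EXCLUDED_FIELD_PREFIXES = ("maternity_",)


def minimise(records: Iterable[Dict], iapt_ids: Set[str]) -> List[Dict]:
    """Keep only the rows and fields the study needs."""
    kept = []
    for rec in records:
        if rec["person_id"] not in iapt_ids:
            continue  # no matching IAPT record
        if rec["age_at_baseline"] < MIN_AGE_AT_BASELINE:
            continue  # below the minimum age; there is no upper limit
        if not (STUDY_START <= rec["event_date"] <= STUDY_END):
            continue  # outside the requested period
        kept.append(
            {k: v for k, v in rec.items()
             if not k.startswith(EXCLUDED_FIELD_PREFIXES)}
        )
    return kept
```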


MR1393 - Join Dementia Research — DARS-NIC-366913-C2V5F

Type of data: information not disclosed for TRE projects

Opt outs honoured: No - consent provided by participants of research study, No - data flow is not identifiable, Identifiable, Anonymised - ICO Code Compliant, No, Yes (Consent (Reasonable Expectation))

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by the HSCIC, Health and Social Care Act 2012 – s261(2)(c), Informed Patient consent to permit the receipt, processing and release of data by NHS Digital, Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 – s261(2)(c)

Purposes: No (Research)

Sensitive: Non Sensitive, and Sensitive

When:DSA runs 2019-02-01 — 2022-01-31 2017.09 — 2024.05. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: DEPARTMENT OF HEALTH AND SOCIAL CARE

Sublicensing allowed: No

Datasets:

  1. MRIS - List Cleaning Report
  2. Demographics

Objectives:

The ‘Join dementia research’ register is a national service funded by the Department of Health; it enables members of the public to register to be contacted about potential research studies. In registering they consent for their information to be available to the dementia research community.

The link requested to HSCIC information will ensure people who are deceased are removed from the ‘register’ of potential research volunteers, so that no harm or distress is caused by contacting people who have died. The Department of Health (DOH) letter states that the ‘delegation will run up until September 2015’.
The intention is to send HSCIC information on all volunteers from the register on a monthly or quarterly basis (depending on cost). The HSCIC will simply confirm if any of the volunteers have died by supplying fact of death. No updated demographics will be provided to University College London (UCL).

Yielded Benefits:

Using the NHS List Cleaning Product has yielded several benefits to several parties:
• NIHR CRNCC has been able to remove several hundred deceased volunteers, enabling the Register to meet its required standards;
• NIHR CRNCC is able to maintain the currency of the JDR Register;
• NIHR CRNCC is able to prevent undue distress to JDR Registrants or their families by ensuring the research staff do not contact bereaved families;
• NIHR CRNCC is able to keep its promise to JDR Registrants and/or their families by ensuring the Registrant’s details are removed from the Register upon death;
• The currency of the Register fosters trust between individuals and encourages participants to sign up;
• Increased numbers signing up to the Register increases the likelihood of the JDR system being able to meet the PM challenge. The JDR Register ensures a steady supply of research participants to research studies for which they may have been matched. This increases the level of research into dementia, the potential for improving treatments for those with dementia and the likelihood of finding a cure for this terrible disease.
• Nearly 40,000 volunteers have signed up to JDR to be contacted about research opportunities; of these, over 11,000 volunteers have been enrolled into research studies. JDR has been used on over 330 research studies in over 250 NHS, University and commercial sites.

Data that has been supplied, or will continue to be supplied, from NHS Digital will not be used in support of a particular PhD or postgraduate research study.

Expected Benefits:

The benefits are that ‘Join Dementia Research’ will be able to process data fairly, without unintentionally breaching the undertaking given to volunteers that their identifiable information will be removed from the register after their deaths, and that the risk of causing distress by attempting to contact members of the cohort who have died will be reduced.
The following information provides background on ‘Join Dementia Research’:
The service has been running since July 2014, and was nationally launched in February 2015. The benefits described are already being recognized, but they will increase over the next 2-3 years as the register grows.
Benefits of the register -
‘Join dementia research’ has been funded by the Department of Health and is delivered in partnership with the National Institute for Health Research, Alzheimer Scotland, Alzheimer’s Research UK and the Alzheimer’s Society. Its development was prompted by the Prime Minister’s Challenge on Dementia, and its purpose is to support the PM Challenge target to ensure 10% of all people with dementia are involved in dementia research.
The benefits of this are that:
a. The system gives everyone in the country aged over 18 an opportunity to express an interest in being involved in research.
b. All dementia research studies taking place in the UK (funded by government, NIHR, charities and commercial organizations) with ethical approval can use the system, providing a new and improved way of identifying and recruiting volunteers into vitally important dementia research studies.
c. The traditional way of recruiting dementia research volunteers is through NHS memory clinics. This method takes time, as researchers wait for suitable subjects to come through clinic. Join dementia research removes this barrier by having volunteers ready and waiting to join studies. All dementia research studies will recruit more quickly, saving time and money. Currently over 70% of research studies exceed recruitment target times; this system will speed up those times.
d. As a result of studies being concluded more quickly, we can ensure that the findings of those studies can be acted upon and implemented or considered for the benefit of patients and the public. The service will also help ensure that studies funded and delivered across the world could be attracted to take place in the UK.
e. The studies look at prevention, diagnosis, treatment, care and potentially cures for people living with dementia.
f. Over the next 12-18 months we expect the service to have attracted 100,000 volunteers and to become the main mechanism by which researchers find study volunteers.
g. The system is already supporting recruitment to 29 studies (over half of all studies on the NIHR CRN portfolio) and has recruited 219 (over 10%) of all participants into research studies such as the PROTECT study at King’s College, an important study which gathers data to support innovative research to improve our understanding of the ageing brain and why people develop dementia, and EXPEDITION 3, an Eli Lilly study which is testing a new medication for people with mild early dementia symptoms.
h. The service was nationally launched in February, it was announced in the media and here is a link to the press release with comments from Secretary of State for Health and Chief Medical Officer http://news.joindementiaresearch.nihr.ac.uk/press-pack-toolkit/
i. It will contribute to delivery of the Prime Minister Challenge on Dementia target of having 10% of all people with dementia involved in research.
www.joindementiaresearch.nihr.ac.uk

Outputs:

The link to HSCIC will lead to the removal of records of deceased patients from the register which enables ‘Join Dementia Research’ to comply with the following undertaking from the consent forms used when recruiting patients:
“I understand that if I withdraw, or after my death, then all identifiable information will be removed from ‘Join dementia research’.”
The timely removal of deceased patients’ records will reduce the chances of contacting people who are deceased.
The register will be updated at least quarterly and possibly more frequently.

Processing:

University College London Hospitals NHS Foundation Trust (UCLH) will periodically provide HSCIC with lists of identifying details of patients from the register. The lists will include name, date of birth, NHS number, postcode and gender. Using its List Cleaning service, the HSCIC will confirm which patients are deceased. The information is then used to remove deceased patients from the register. Once the deceased patients have been removed from the register, the data supplied by the HSCIC will be permanently deleted. The data provided by HSCIC will not be shared with, or processed by, any third party, and no third party can access records of patients deleted from the register to identify which were reported as deceased by the HSCIC.
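
The list-cleaning cycle described above can be sketched as follows. This is an illustrative outline only: the record layout, the function names and the use of NHS number as the matching key in the returned flags are assumptions, and clearing an in-memory set merely stands in for the permanent deletion the agreement requires.

```python
from typing import Dict, List, Set

# The five identifying details the text says are sent for list cleaning.
SUBMITTED_FIELDS = ("name", "date_of_birth", "nhs_number", "postcode", "gender")


def build_submission(register: List[Dict]) -> List[Dict]:
    """Extract only the listed identifying details for each volunteer."""
    return [{field: v[field] for field in SUBMITTED_FIELDS} for v in register]


def remove_deceased(register: List[Dict], deceased_nhs_numbers: Set[str]) -> List[Dict]:
    """Drop volunteers reported as deceased, then discard the supplied flags."""
    cleaned = [v for v in register if v["nhs_number"] not in deceased_nhs_numbers]
    # Stand-in for permanent deletion of the supplied data once the
    # register has been updated.
    deceased_nhs_numbers.clear()
    return cleaned
```

Returning a new list rather than editing the register in place keeps the update atomic: the old register can be swapped for the cleaned one in a single step, with no record kept of which entries were removed because of a death flag.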


MR104 - Regional Heart Study — DARS-NIC-148411-Q64H8

Type of data: information not disclosed for TRE projects

Opt outs honoured: Yes - patient objections upheld, Identifiable, Yes (Section 251, Section 251 NHS Act 2006)

Legal basis: Section 251 approval is in place for the flow of identifiable data, Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistical Registration Service Act 2007 , Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 – s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'., Health and Social Care Act 2012 - s261(5)(d); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2018-03-31 — 2021-03-30 2017.06 — 2024.05. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL), NEWCASTLE UNIVERSITY, UNIVERSITY COLLEGE LONDON (UCL), UNIVERSITY COLLEGE LONDON (UCL), UNIVERSITY OF NEWCASTLE UPON TYNE

Sublicensing allowed: No

Datasets:

  1. MRIS - Cohort Event Notification Report
  2. MRIS - Cause of Death Report
  3. MRIS - Members and Postings Report
  4. Civil Registration - Deaths
  5. Demographics
  6. Cancer Registration Data
  7. MRIS - Flagging Current Status Report
  8. Civil Registrations of Death

Objectives:

The data supplied by the NHS IC to UCL Medical School will be used only for the approved Medical Research project

Yielded Benefits:

Research from the BRHS has been used to shape and change many policies on cardiovascular disease prevention, both nationally and internationally; a selection of these are listed below. Previous research is also cited in guidelines produced by professional organisations for treatment of specific chronic conditions, e.g. NICE guidelines, American Heart Association guidelines for prevention of stroke and transient ischemic attack, for management of cardiovascular disease, and management of patients with ventricular arrhythmias, Australian guidelines for management of cardiovascular disease risk, Joint British Societies management of cardiovascular disease guidelines, Endocrine Society guidelines on hypertriglyceridemia and obesity. Evidence generated from the research has also been used to support local public health programmes, for example, in developing initiatives for primary prevention of CVD and dementia in South East London. The research findings have been published in open access peer-reviewed scientific journals related to public health.

Cardiovascular and Stroke Prevention
- 2000 UK Parliament Select committee on Health, Memorandum by the Stroke Association (TB 17).
- 2003 European Society of Cardiology clinical practice guidelines - European Heart Risk Score. Estimation of ten-year risk of fatal cardiovascular disease in Europe: the SCORE project. European Heart Journal (2003)
- 2005 Joint British Societies' Guidelines on Prevention of Cardiovascular Disease in Clinical Practice.
- 2004 NICE - Public health guidance on the prevention of cardiovascular disease (CVD) at population level.
- 2007 Management of stable angina. SIGN guidance 96.
- 2007 Risk estimation and the prevention of cardiovascular disease. SIGN guidance 97.
- 2007 WHO Prevention of Cardiovascular Disease Guidelines for assessment and management of cardiovascular risk.
- 2008 Management of patients with stroke or TIA: assessment, investigation, immediate management and secondary prevention. A national clinical guideline. SIGN National guideline 108.
- 2008 European Guidelines for management of ischaemic stroke and transient ischaemic attack.
- 2010 Cardiovascular disease prevention Public health guideline [PH25] NICE Guidance.
- 2011 European Guidelines for management of ischaemic stroke and transient ischaemic attack.
- 2011 AHA / ASA Guidelines for the Primary Prevention of Stroke
- 2014 AHA / ASA Guidelines for the Prevention of Stroke in Patients With Stroke and Transient Ischemic Attack.
- 2014 AHA / ASA Guidelines for the Primary Prevention of Stroke.
- 2014 Joint British Societies' consensus recommendations for the prevention of cardiovascular disease (JBS3).
- 2016 European guidelines on cardiovascular disease prevention in clinical practice.

Smoking & Passive smoking
- 2016 Stopping Smoking: What health professionals should know and how to encourage smokers to quit: British Thoracic Society Tobacco Specialist Advisory Group March 2016.
- 2012 Papers examining the health effects of passive smoking using an objective measure of smoke exposure rather than a self report were published in 2009-10 and received news media coverage. The findings informed the UK 2012 government campaign about the dangers of passive smoking. The BRHS papers are cited in the evidence about passive smoking and risk of CHD, stroke in the updated Surgeon General report in USA.
- 2014: The Health Consequences of Smoking - 50 Years of Progress: A Report of the Surgeon General. Editors: National Center for Chronic Disease Prevention and Health Promotion (US) Office on Smoking and Health. Atlanta (GA): Centers for Disease Control and Prevention (US); 2014.

Alcohol
- 2010 Dietary guidelines for Americans: Alcohol.
- 2012 House of Commons Science and Technology Committee Alcohol guidelines Eleventh Report of Session 2010-12: Volume II Additional written evidence. Ordered by the House of Commons to be published 12 and 19 October 2011.

Diabetes
- 2008 An Endocrine Society Clinical Practice Guideline. Primary Prevention of Cardiovascular Disease and Type 2 Diabetes in Patients at Metabolic Risk.
- 2010 Management of diabetes. SIGN National clinical guideline 116.
- 2012 Endocrine Society clinical practice guidelines for Hypertriglyceridemia.
- 2014 Lipid modification NICE clinical guideline CG181.
- 2015 AHA/ ADA. Update on Prevention of Cardiovascular Disease in Adults With Type 2 Diabetes
- 2011 ASA/ACCF/AHA/AANN/AANS/ACR/ASNR/CNS/ SAIP/SCAI/SIR/SNIS/SVM/SVS Guideline on the Management of Patients With Extracranial Carotid and Vertebral Artery Disease.
- 2012 UK National Screening Committee. The Handbook for Vascular Risk Assessment, Risk Reduction and Risk Management.
- 2015 Endocrine Society clinical practice guidelines for the pharmacological management of obesity.
- 2015 NICE Clinical Guideline CG 43 Obesity Prevention.

Social Determinants
- 2015 AHA Scientific Statement Social Determinants of Risk and Outcomes for Cardiovascular Disease.

Physical activity (Not a guideline but a resource)
- ACSM's Resource Manual for Guidelines for Exercise Testing and Prescription. Edited by David P. Swain, ACSM, Clinton A. Brawner.
- 2013 IACR Cardiac Rehabilitation Guidelines.

Expected Benefits:

The BRHS has a track record of providing high quality evidence to improve health of the public in UK and internationally. Global trends of ageing populations will acutely increase the health and social care burdens on individuals and society from chronic diseases such as cardiovascular disease, diabetes, dementia, other chronic diseases and disability in later life. Therefore, research in a cohort study of older men aims to establish the contributions of potentially important factors (obesity, diabetes, health behaviours, environmental and social factors) to prevent cardiovascular disease, diabetes, dementia, other chronic diseases and disability in later life.

To date the study has published over 500 peer reviewed research papers, providing high quality evidence about the epidemiology of these conditions and improving understanding on how to manage, treat and prevent them.

Importantly, these papers have informed evidence based strategies to reduce the health and social care burden in older populations, as outlined in detail in section “Specific output” above. The researchers have contributed to a range of influential UK and international clinical guidelines for management and treatment of important chronic conditions including CHD, stroke, angina, arrhythmias, and diabetes which together cause substantial burdens of ill health in UK and globally, and will continue to contribute with findings from the new data requested.

The specific benefits from the use of the data will be to generate further high quality research evidence about prevention of chronic diseases and to improve the health of older populations. Linking the existing BRHS databases to NHS Digital data will permit the researcher to study a wider range of public health relevant topics. The potential benefits for prevention of cardiovascular disease, diabetes, dementia, other chronic diseases and disability in later life are substantial. Target dates will run from the time of acquiring the data until 2019 with plans to further extend funding for our study.

Outputs:

More than 500 peer-reviewed reports have already been published based on the study which uses mortality data from NHS Digital. It is hoped that research from using the data requested under this Agreement will be published and utilised in the same way.

Research from the BRHS has been used to shape and change many policies on cardiovascular disease prevention, both nationally and internationally, for example, in developing initiatives for primary prevention of CVD and dementia in South East London. The BRHS provide outputs in the form of peer reviewed publications from the research in speciality journals in cardiovascular disease, heart failure, diabetes, stroke and geriatric medicine. It also provides research directly to funding bodies and policy makers (Department of Health, British Heart Foundation, National Institute of Health Research, Medical Research Council, UK Health Forum), clinicians, public health specialists and other health researchers who then use the evidence to develop preventive strategies.

Findings will be further disseminated via national conference presentations including The Society for Social Medicine, the Nutrition Society, the British Geriatric Society, and Public Health England and via international conference presentations including the AHA Epidemiology and Prevention | Lifestyle and Cardiometabolic Health and The International Society of Behavioral Nutrition and Physical Activity meetings.

The Study findings will also be cited in reports by a range of influential national and international public sector bodies including the UK House of Commons Health Select Committee, the UK Department of Health, the U.S. Surgeon General (whose reports inform health policies both in USA and other countries around the world) and the World Health Organisation (e.g. their Guidelines for assessment and management of cardiovascular risk).

All outputs will be restricted to aggregate data with small numbers suppressed in line with the HES Analysis Guide. No publications/outputs from the British Regional Heart Study have ever presented or will present data which allow the identification of individuals. All data presentation is based on groups of subjects (generally > 50 subjects, often considerably larger numbers). The data from NHS Digital will not be used for any purpose other than that outlined in this Agreement.

Processing:

UCL have requested continuation of the monthly updates to the cohort regarding cancer registration, date, fact and cause of death.

The BRHS currently receives data from three sources:

1. Study participants - Physical Examinations (1978-80, 1998-2000, 2010-2012) and regular postal questionnaires
2. GP record review - Morbidity data collected annually directly from participants' GP
3. NHS Digital - Participants were flagged in 1978-80 and the study receives Mortality notification & Cancer registration on a monthly basis via this existing data sharing agreement. HES, MHMDS and DIDs data are received under NIC-28591-H5Q3X-v0.18

The personal identifiers are held in the data safe haven and access to this is strictly limited to a few named individuals, all substantive employees of University College London (UCL). UCL will provide NHS Digital with the following cohort identifiers for linkage to the datasets:

1) Study ID
2) NHS Number
3) Date of Birth
4) Sex
5) Last known postcode.

NHS Digital will return a pseudonymised dataset to the applicant containing Study ID and match rank code. UCL's Data manager will then link this NHS Digital pseudonymised dataset to the BRHS cohort data Study ID for analysis.

Only Study IDs are used to link the NHS Digital data to the BRHS cohort data. No personal identifiers are contained within this dataset.
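
The Study ID linkage step could look like the sketch below. It is a hedged illustration, not the study's actual procedure: the field names (`study_id`, `match_rank`, and the identifier names in `IDENTIFIER_FIELDS`) and the in-memory join are assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical names for identifier fields that must not reach the analysis dataset.
IDENTIFIER_FIELDS = {"nhs_number", "date_of_birth", "postcode"}


def link_on_study_id(cohort: List[Dict], nhs_rows: List[Dict]) -> List[Dict]:
    """Join returned NHS Digital rows to cohort rows using Study ID only."""
    cohort_by_id = {row["study_id"]: row for row in cohort}
    linked = []
    for row in nhs_rows:
        leaked = IDENTIFIER_FIELDS & row.keys()
        if leaked:
            # Refuse to link rows that still carry personal identifiers.
            raise ValueError(f"identifier fields present: {sorted(leaked)}")
        match = cohort_by_id.get(row["study_id"])
        if match is not None:
            linked.append({**match, **row})
    return linked
```

Rows whose Study ID is not in the cohort are dropped, so the linked dataset never grows beyond the cohort the study already holds.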

The data will then be made available to the research team of Medical Statisticians, Epidemiologists and Public Health clinicians, to carry out their research analysis. All the researchers working on the data are substantive employees of UCL.

No publications/outputs from the British Regional Heart Study have ever presented or will present data which allow the identification of individuals. All data presentation is based on groups of subjects (generally >50 subjects, often considerably larger numbers). Therefore all outputs will be restricted to aggregate data with small numbers suppressed in line with the HES Analysis Guide.

The data from NHS Digital will not be used for any purpose other than that outlined in this Agreement. Data will not be linked with any other sources, other than those specified in this Agreement.

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract, i.e. employees, agents and contractors of the Data Recipient who may have access to that data). There will be no requirement nor attempt to re-identify individuals from the data. All processing of ONS data will be in line with ONS standard conditions. The data from NHS Digital will not be used for any purpose other than that outlined in this agreement.


SUMMIT Study: Cancer screening study with or without low-dose lung CT to validate a multi-cancer early detection test (Previously ODR1718_316) — DARS-NIC-656813-F4H5W

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant, No (Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 – s261(2)(c)

Purposes: Yes (Academic)

Sensitive: Non-Sensitive, and Sensitive

When:DSA runs 2022-12-20 — 2027-12-19 2023.01 — 2024.04. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: GRAIL, LLC, UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Emergency Care Data Set (ECDS)
  2. NDRS Cancer Registry
  3. NDRS Linked DIDs
  4. NDRS Linked HES APC
  5. NDRS Linked HES Outpatient
  6. NDRS National Lung Cancer Audit (NLCA)
  7. NDRS National Radiotherapy Dataset (RTDS)
  8. NDRS Rapid Cancer Registrations
  9. NDRS Somatic Molecular Dataset
  10. NDRS Systemic Anti-Cancer Therapy Dataset (SACT)
  11. NDRS Cancer Registrations

Objectives:

University College London (UCL) and GRAIL, Limited Liability Company (LLC) are requesting NHS Digital record level data for the Study: "SUMMIT: Cancer screening study with or without low-dose lung CT* to validate a multi-cancer early detection test"

(*low-dose computed tomography (also called a low-dose CT scan, or LDCT) is a screening test for lung cancer. During an LDCT scan, you lie on a table and an X-ray machine uses a low dose (amount) of radiation to make detailed images of your lungs. The scan only takes a few minutes and is not painful.)

Data for this study has previously been shared when the data were controlled and managed by Public Health England (PHE). PHE facilitated data release via its Office of Data Release service (ODR). ODR was responsible for providing a common governance framework for responding to requests to access PHE data for secondary purposes, including service improvement, surveillance and ethically approved research. All requests to access data were reviewed by the ODR and were subject to strict confidentiality provisions. The responsibility for the management of the National Disease Registration Service of which the National Cancer Registration and Analysis Service is a part, transferred from PHE to NHS Digital on 1st October 2021. The SUMMIT study previously accessed data via Public Health England under the reference: ODR0718_316.

MAIN AIM AND PURPOSE OF SUMMIT
The SUMMIT Study aims to understand ways to detect lung cancer before there are any symptoms, when treatment can be simpler and more successful.

The SUMMIT Study is a prospective cohort study of approximately 13,000 participants from London designed to investigate how cancer screening can be improved and delivered. The study will recruit individuals at high risk for cancer, especially lung cancer, due to significant smoking history. The study has two main aims:
1. To develop and evaluate the performance of the GRAIL blood test for the detection of multiple cancer types and the identification of tissue of cancer origin
2. To examine the performance and feasibility of delivering a low dose CT (LDCT) screening service for lung cancer to a high-risk population in London and the surrounding area.

***THIS VERSION (v1): OCTOBER 2022***
This is a request to Renew and Amend a Data Sharing Agreement (DSA) with University College London (UCL) and GRAIL, LLC.

STUDY AIMS
The data requested are key to the SUMMIT Study, as they will allow the study team at UCL to understand what types of cancers the participants may develop during the course of, and after, LDCT screening, thereby identifying what types of cancer signals may be present in participants’ blood. The data requested will also allow the study team to analyse the performance of LDCT screening. LDCT screening has been shown to reduce lung-cancer mortality in at-risk populations by 20-26% and it is hoped that by demonstrating the feasibility of LDCT screening it will enable the study team to make a case for the adoption of a UK national lung cancer screening programme, vastly improving lung cancer outcomes. Similarly the development of a blood test (Galleri Test) for screening would reduce morbidity and mortality for many types of cancers through early detection.
Both GRAIL and UCL blood samples are taken from consenting patients during the SUMMIT study. Genetic testing is being performed by GRAIL, LLC on the samples taken for the purpose of developing the blood test to detect cancer early. UCL may also conduct genetic testing on the samples taken and stored by UCL, however the purpose of this will not be to develop a blood test to detect cancer, but for the purpose of research into lung, cardiac and other diseases. This information is provided to participants on the Patient Information Sheet and Informed Consent form which have received ethical approval from the London - City & East Research Ethics Committee.

NEW DATA REQUESTED
UCL and GRAIL, LLC are requesting further access to the following National Cancer Registration and Analysis Service (NCRAS) National Disease Registration Service (NDRS) datasets (formerly available via Public Health England (PHE)):
- NDRS Rapid Cancer Registrations
- NDRS Cancer Registry
- NDRS Systemic Anti-Cancer Therapy Dataset (SACT)
- NDRS National Radiotherapy Dataset (RTDS)
- NDRS Linked HES Admitted Patient Care (APC)
- NDRS Linked HES Outpatient (OP)
- NDRS Somatic Molecular Dataset
- Emergency Care Dataset (ECDS)
- NDRS Linked Diagnostic Imaging Dataset (DIDS)
- NDRS National Lung Cancer Audit (NLCA)

How the data requested will achieve the study aims:
The NDRS Rapid Cancer Registry data and NDRS Cancer Registry data are requested to clinically validate a cell-free nucleic acid (cfNA) based GRAIL blood test for early detection of multiple types of cancer, including lung cancer, and also to investigate how low dose CT lung cancer screening can be improved and delivered. The additional datasets ECDS, DIDs, HES APC, HES OP, NLCA, RTDS, Somatic Molecular Dataset and SACT will all be used for these aims as well as to answer the secondary endpoints of the study, including investigation of the uptake of LDCT screening (and demographic and psychological characteristics of this), examining the harms associated with LDCT screening, and exploring the outcomes following lung cancer screening and subsequent treatment.

All elements of the GRAIL test will be pre-specified prior to combining the assay results with the clinical data including the classifier and the cut-points for defining positive vs negative results. In addition the data should inform on efficient implementation of an LDCT screening service to detect early-stage lung cancer among current and former smokers. This includes understanding important operational parameters such as screening interval, uptake and adherence, and also psychosocial issues such as psychological impact and harms and quality of life measures.

NHS Digital pseudonymised record level data (linked to SUMMIT study ID) from consented participants are required to link to the Galleri Test and LDCT scan test results obtained within the study at participant study visits. In this agreement, NDRS NCRAS data is requested from 6 months (183 days) prior to the date the participants consented up to around Winter 2027 (with monthly data drops until December 2023 and then quarterly until the end date of this agreement). The Cancer Registry provides historical cancer data dating back to 1995. Ascertaining the historic cancer diagnosis background of each study participant is important as a historic diagnosis may impact SUMMIT test results or treatment decisions.
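As an illustration only, the 183-day look-back described above could be computed as in the following minimal Python sketch. The function names and the `window_end` parameter are hypothetical; the agreement does not give an exact end date ("around Winter 2027"), so the caller has to supply one.

```python
from datetime import date, timedelta

# "6 months (183 days)" as stated in the agreement text.
LOOKBACK_DAYS = 183


def extract_window_start(consent_date: date) -> date:
    """Earliest event date in scope for a participant (hypothetical helper)."""
    return consent_date - timedelta(days=LOOKBACK_DAYS)


def in_window(event_date: date, consent_date: date, window_end: date) -> bool:
    """True if an event falls between the look-back start and the supplied end date."""
    return extract_window_start(consent_date) <= event_date <= window_end
```

For a participant who consented on 14/05/2021 (the date the last participant was enrolled), the window would open on 12/11/2020.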

The SUMMIT study is currently scheduled to last until 10 years after the last participant was enrolled in the study (last participant enrolled 14/05/2021) at which point the end of study will be declared. An extension to the NDRS Cancer Registry Dataset may be requested in the future to coincide with the end of the study.

Long term follow up data is required to ensure any cancers identified after/outside of SUMMIT LDCT screening or developing late can be linked to any potential biomarkers of cancer in the Galleri blood test or any early radiological signs at screening. In addition, long term follow up data (such as recurrence data) can be assessed alongside radiological parameters collected at LDCT screening (for example assessing whether radiological growth rates can predict post treatment outcomes).

PREVIOUSLY HELD DATA
To note: The study team at UCL currently hold data which was collected via PHE (under ODR0718_316).

The Data sharing contract between UCL and Public Health England (which transferred to NHS Digital via novation agreement on 13/12/2021) granted UCL access to cancer registration data from a bespoke ‘early ascertainment’ dataset. In this version of the agreement (v1) the study team wish to transfer from receiving the Early ascertainment dataset over to the Rapid Registry Dataset. Additionally, the study team also request that GRAIL, LLC and GRAIL Bio UK Ltd are permitted access to the ‘early ascertainment’ datasets which UCL already hold. These datasets will be transferred from UCL to GRAIL, LLC and GRAIL Bio UK Ltd in the same way as described for the NDRS NCRAS datasets applied for in this agreement.

It is hoped that the ODR Early Ascertainment datasets, NDRS Rapid Registry Dataset and NDRS Cancer Registry dataset can all be used to inform on future work related to the development of the blood test to detect cancer early and also for the implementation of a UK National Lung Cancer Screening programme. This work will follow on from the SUMMIT study.

Cancer Research UK (CRUK) provides core funding to UCL Clinical Trials Unit (CTU). All employees working for 'Cancer Research UK and UCL Cancer Trials Centre' (UCL CTC) are substantive employees of UCL and not employed directly by CRUK. CRUK does not have direct impact on how the SUMMIT study is to be run, and no CRUK employees will be able to view, access or process any NHS Digital record level data, and are therefore not included as a Data Processor on this agreement.

COMMON LAW DUTY OF CONFIDENTIALITY
The study team at UCL will be providing one cohort for this request containing approximately 13,000 individual records.
Consent to data linkage has been sought for all participants in the study. NHS Digital will not apply National Data Opt-Out for these participants and are content that the consent materials are compatible with the flow of data described in this agreement.

LAWFUL BASIS FOR THE PROCESSING OF PERSONAL DATA (GDPR)
University College London is relying on GDPR Article 6 (1)(e): processing is necessary for the performance of a task carried out in the public interest, and additionally (as health data is a special category of Personal Data), Article 9(2)(j): processing is necessary for archiving purposes in the public interest, scientific or historical research purposes. Participants that lack capacity to provide fully informed consent were not included in the SUMMIT study and consultee consent was not permitted. Data minimisation processes are being followed and only data that is required specifically for the purposes of this study has been requested, to protect the rights of the data subjects. In addition data will not be collected from participants that have withdrawn their consent to future data collection.

GRAIL LLC is using GDPR Article 6(1)(f) "processing is necessary for the purposes of the legitimate interests pursued by the controller or by a third party except where such interests are overridden by the interests or fundamental rights and freedoms of the data subject which require protection of personal data, in particular where the data subject is a child." Processing personal data is necessary for GRAIL, LLC’s legitimate interests which are described in this application. The data to which access is requested are proportionate and necessary to achieve those interests. GRAIL, LLC has completed a legitimate interests assessment (LIA). The data subjects’ interests and fundamental rights are protected through appropriate minimisation of fields and patient records being processed; protection of the data in a secure environment; and guaranteeing secure destruction at any stage at the request of NHS Digital or after a defined period on completion of the project. Additionally (as health data is a special category of Personal Data), GRAIL, LLC is also relying on Article 9(2)(j): special category data used for “archiving in the public interest, scientific or historical research or statistical purposes,” with a basis in law. The data in the SUMMIT study is being requested to be used in the public interest, as the SUMMIT study aims to understand ways to detect lung cancer before there are any symptoms, when treatment can be simpler and more successful. If the study team are successful in their endeavours, this intervention could be brought to a wider UK population, thereby vastly improving lung cancer outcomes for UK patients and benefiting the NHS.

PATIENT AND PUBLIC INVOLVEMENT (PPIE):
The acceptability of the SUMMIT study has been discussed with and approved by PPIE and General Practitioner (GP) groups, who have been involved throughout the design and running of the study.

There have been three separate SUMMIT specific face-to-face PPIE sessions, with different members in each session. These members were representative of the group being invited to this study (i.e. smokers and former smokers in the eligible age bracket). The earlier sessions discussed the design and concept of the study, particularly the invitation process. The later sessions were extremely focused and looked in detail at the Participant Information Sheet (PIS), consent form and other documents including the collection and processing of personal data. This group also looked at invitation and results letters sent to the participants in order to report back the results of their LDCT scans. All feedback received from both PPI members and GPs has been considered and incorporated appropriately into the study design.

It was important to get input from a diverse PPI group and these included:
• Eleven attendees plus one phone feedback.
• Nine of the attendees were males and three were females (including phone feedback)
• Cancer patients – 5 of them have received or are currently receiving treatment for cancer but not lung cancer
• Smoking history – 8 of the attendees were either light or heavy smokers
• Three attendees work in the construction industry
• Four attendees are members of the UCLH Cancer Patient and Public Advisory Group
• Two people indicated that they have caring responsibilities for a family member or friend with a cancer diagnosis

The study team have two PPIE members on the Project Steering Group and also a centralised UCL Cancer Trials Centre (CTC) PPIE group to call upon when needed. These members have continued to help the SUMMIT team understand and accommodate the public perspective on LDCT screening, sampling and data processing throughout the duration of the study. The PPIE members will also be key in interpreting and disseminating the study results.

Organisation Roles and Responsibilities:
The SUMMIT study is an academic study sponsored by UCL and funded by, and run in collaboration with GRAIL, LLC.
• UCL are a joint data controller and lead for this agreement, who also process the data and are responsible for sending participant identifiers to NHS Digital for data linkage and for receiving NHS Digital record level data, downloading onto the Data Safe Haven and sharing with GRAIL, LLC.
• GRAIL, LLC are a joint data controller who also processes the data and are responsible for receiving NHS Digital pseudonymised record level data and sharing with GRAIL Bio UK Ltd.
• GRAIL Bio UK Ltd will be receiving pseudonymised record-level NHS Digital data via GRAIL, LLC and are therefore listed as a Data Processor in this agreement.


NOTE: GRAIL, LLC is the successor in interest to GRAIL, Inc. GRAIL, LLC encompasses all GRAIL locations including GRAIL Bio UK Ltd which is also a data processor located in the UK. GRAIL Bio UK Ltd were not in existence when SUMMIT was set up and open to recruitment. It is agreed by UCL and GRAIL that, from the participant information sheet and consent documentation, participants would be aware that their data would be processed by GRAIL encompassing both locations.

For full transparency of the Commercial element of this agreement, it is noted that GRAIL LLC and/or GRAIL Bio UK Ltd may take the results of the SUMMIT study to further refine the algorithm of their MCED test that could add commercial value to their product(s). Therefore in the future, GRAIL LLC and/or GRAIL Bio UK Ltd may receive commercial benefit (including intangible or indirect commercial benefits such as positive publicity) from the successful outcomes of the trial.

Yielded Benefits:

There are no current yielded benefits from receipt of the Early Ascertainment data to document as yet.

Expected Benefits:

Demonstrating the feasibility of LDCT screening for lung cancer should enable the study team to add to the evidence needed to establish a national screening programme in the UK. By successfully carrying out the SUMMIT study, the study team could bring this intervention to a wider UK population, thereby vastly improving lung cancer outcomes. Should a national lung cancer screening programme go ahead, SUMMIT will also be able to provide valuable information to shape this programme. This includes improving current lung cancer risk models (such as Prostate, Lung, Colorectal and Ovarian risk (PLCO)), finding optimal invitation strategies, understanding the demographic and psychological characteristics of participants undergoing screening, optimising screening processes, and also informing how best to implement LDCT screening (e.g. annually vs biennially). It is hoped that this data will improve the uptake and efficiency of future screening programmes and increase the accuracy and sensitivity of lung cancer detection of those participating, ultimately improving treatment outcomes and survival.

As the largest population based LDCT screening study in the UK, the SUMMIT Study continues to be of considerable public benefit; both directly through screening high-risk adults for lung cancer and indirectly through answering outstanding questions the national screening committee have on how to implement LDCT screening in future UK screening programmes. SUMMIT has also helped direct and inform the NHS England commissioned Targeted Lung Health Check programme currently being implemented. Many of the key points learnt from running SUMMIT will be implemented into the roll out of this programme at UCLH.

In the medium term, the development of an early cancer blood test will provide improved cancer screening and earlier diagnosis. A minimally invasive, relatively inexpensive blood test to detect multiple types of cancer will be invaluable to future healthcare worldwide where cancer can be detected earlier when it can be better treated and cured. It is expected that the Galleri test will also predict the origin of the cancer signal with high accuracy to help guide diagnosis. Using the Galleri test alongside existing screening tools is expected to improve early cancer detection for patients at an elevated risk of cancer, such as those aged 50 or older.

Through implementation of LDCT lung cancer screening and a blood test to detect cancer, most patients should be diagnosed earlier than they would otherwise be, with an expected decrease in cancer stage for patients presenting in cancer clinics (a general downshift in cancer stage). Although it is understood that an earlier diagnosis might not benefit every patient, in the majority of cases cancers will be detected and treated earlier, when treatment success and survival rates are better. The 1-year survival rates based on cancer stage between 2013-2017 are below (data from CRUK):
- Stage 1: 87.2%
- Stage 2: 73.0%
- Stage 3: 48.7%
- Stage 4: 19.3%

Outputs:

The ultimate result of the data processing for the SUMMIT study is to develop a blood test to detect cancer early and also implement a national LDCT lung cancer screening programme in the UK.

The expected outputs include submission to peer-reviewed journals, conferences and presentations. The planned journals include Lancet Respiratory Medicine, Lancet Oncology, European Respiratory Journal and Annals of Oncology.

The first primary outputs are expected with a target end date of 2023/mid-2024. The study team intend to release further publications once the data matures and more NDRS NCRAS data is received. Primary outputs will be linked to study endpoints:
• To examine LDCT screening delivery using established measures of performance and risk prediction.
• To quantify the uptake of LDCT screening, and examine the demographic and psychological characteristics and smoking status of those who consent to be screened.
• To examine adherence to and practicability of a biennial LDCT versus annual screening.
• To identify the psychological and screening-related factors which predict uptake of, and repeat adherence to, LDCT screening for lung cancer, as well as their sociodemographic and smoking-related correlates.
• To investigate QoL over time, to explore associations with screening adherence, the frequency of screening and abnormal LDCT results.
• To examine the harms associated with LDCT screening.
• To evaluate the performance of the GRAIL test for the detection of lung cancer within 12 months of Y0, Y1 and Y2 timepoints.
• To evaluate the performance of the GRAIL test for the detection of invasive cancer and identification of tissue of cancer origin within 12 months of Y1 and Y2 timepoints.
• To evaluate the performance of the GRAIL test for the detection of invasive cancer and identification of tissue of cancer origin within 24 months of Y0 and Y1 timepoints.
• To evaluate the performance of the GRAIL test by cancer type, stage and method of diagnosis.
• To evaluate association of the GRAIL test result and cause-specific survival (e.g. cancer, cardiovascular) and overall survival.

The plan is to disseminate aggregate results (with small numbers suppressed according to the HES Analysis Guide) to public and patient communities, for example on the UCL CTC website, Cancer Research UK (CRUK) website, clinicaltrials.gov and via the HRA Final Report. The study team also plan to send a newsletter summarising the results in lay language for all participants that have taken part in SUMMIT. Newsletters will be reviewed by the SUMMIT PPIE members and by the UCL CTC PPIE group before submission to REC for review. The study team also intend to include the PPIE group in any other dissemination activities within patient groups, such as lung cancer charities, conferences and PPIE Open days.
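Small-number suppression of aggregate counts can be sketched as below. This is a minimal illustration, not the HES Analysis Guide procedure itself: the `threshold` value, the `"*"` marker and the function name are assumptions chosen for the example, and the actual rules should be taken from the Guide.

```python
def suppress_small_numbers(counts, threshold=7, marker="*"):
    """Replace non-zero counts at or below `threshold` with a marker.

    `threshold` and `marker` are illustrative assumptions, not values
    quoted from the HES Analysis Guide.
    """
    suppressed = {}
    for group, n in counts.items():
        if 0 < n <= threshold:
            suppressed[group] = marker  # too few subjects to publish safely
        else:
            suppressed[group] = n
    return suppressed


# Hypothetical aggregate table: group label -> number of participants.
table = {"Stage 1": 120, "Stage 2": 43, "Stage 3": 5, "Stage 4": 0}
published = suppress_small_numbers(table)
```

In this hypothetical table only the "Stage 3" cell (5 participants) would be replaced by the marker before publication.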

In addition the outputs of data processing at the end of the study aim to include conference abstracts, reports to NHS England and GRAIL, and submissions of SUMMIT findings to peer reviewed journal(s). The publications will not contain the data, only the results of its statistical analysis that will be summarized overall.

GRAIL may take the results of the SUMMIT study to further refine the algorithm of their MCED test that could add commercial value to their product(s).

Analysis of interim study data has already been published in various journals (see below), posters presented at BTOG 20/22, World Lung 19/20 and ERS 22, and abstracts submitted to BTS 20 and ATS 21:
- Horst C, Dickson JL, Tisi S, Ruparel M, Nair A, Devaraj A, Janes SM. Delivering low-dose CT screening for lung cancer: a pragmatic approach. Thorax 2020;75:831-832.
- Quaife SL, Waller J, Dickson JL, Brain KE, Kurtidu C, McCabe J, Hackshaw A, Duffy SW, Janes SM. Psychological Targets for Lung Cancer Screening Uptake: A Prospective Longitudinal Cohort Study. J Thorac Oncol. 2021 Dec;16(12):2016-2028.
- Dickson JL, Hall H, Horst C, Tisi S, Verghese P, Mullin AM, Teague J, Farrelly L, Bowyer V, Gyertson K, Bojang F, Levermore C, Anastasiadis T, Sennett K, McCabe J, Devaraj A, Nair A, Navani N, Callister ME, Hackshaw A; SUMMIT Consortium, Quaife SL, Janes SM. Telephone risk-based eligibility assessment for low-dose CT lung cancer screening. Thorax. 2022 Jul 21:thoraxjnl-2021-218634.
- Dickson JL, Bhamani A, Quaife SL, Horst C, Tisi S, Hall H, Verghese P, Creamer A, Prendecki R, McCabe J, Gyertson K, Bowyer V, El-Emir E, Cotton A, Mehta S, Bojang F, Levermore C, Mullin AM, Teague J, Farrelly L, Nair A, Devaraj A, Hackshaw A, Janes SM; SUMMIT consortium. The reporting of pulmonary nodule results by letter in a lung cancer screening setting. Lung Cancer. 2022 Jun;168:46-49.

A newsletter has also been disseminated to participants in June 2021 providing an update on the study’s progress.

Processing:

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract). There will not be any access to the data by any third parties.

This application is to request the renewal and amendment of data previously provided under the agreement with the Office for Data Release ODR0718_316.

DATAFLOW
This section outlines, at a high level, how data flows between UCL, GRAIL LLC, GRAIL Bio UK Ltd and NHS Digital for the SUMMIT study. The flow covers two key steps:
1. Transfer of linkage file/identifiers from UCL to NHS Digital.
2. Transfer of linked data from NHS Digital back to UCL, GRAIL LLC and GRAIL Bio UK Ltd.

Transfer of linkage file/identifiers from UCL to NHS Digital:
• UCL will send the list of Patient Identifiable Data (PID) - including participant Study ID, NHS Number, Gender, Date of Birth and Postcode - securely to NHS Digital via a Secure Electronic File Transfer Service (SEFT) or other secure, NHS Digital approved file transfer mechanism. This identifiable data is stored in the SUMMIT Clinical Records Management system (“SCRMS”). This list will only include participants that have consented to SUMMIT and have not withdrawn their consent for future data collection. This is to ensure that participant’s rights to object are respected and to abide by data minimisation principles.
• The NHS Digital data production teams will link patient identifiers to the datasets requested in section 3.
• The NHS Digital production team will remove patient identifiers from the linked data to mitigate any risk of reidentification of participant data.
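The link-then-strip steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: all field names are hypothetical, and matching is done on NHS Number alone, whereas the agreement does not describe the matching rules NHS Digital actually applies.

```python
# Identifier fields removed before release (hypothetical field names).
PID_FIELDS = ("nhs_number", "gender", "date_of_birth", "postcode")


def link_and_pseudonymise(cohort, records):
    """Attach the Study ID to matching records, then drop identifiers.

    Assumption: matching uses NHS Number only, purely for illustration.
    """
    study_id_by_nhs = {p["nhs_number"]: p["study_id"] for p in cohort}
    released = []
    for rec in records:
        study_id = study_id_by_nhs.get(rec["nhs_number"])
        if study_id is None:
            continue  # person is not in the consented cohort
        clean = {k: v for k, v in rec.items() if k not in PID_FIELDS}
        clean["study_id"] = study_id  # the only key kept for linkage
        released.append(clean)
    return released


cohort = [{"study_id": "S001", "nhs_number": "9000000001",
           "gender": "F", "date_of_birth": "1955-02-01", "postcode": "N1 1AA"}]
records = [
    {"nhs_number": "9000000001", "postcode": "N1 1AA", "diagnosis": "C34"},
    {"nhs_number": "9000000002", "postcode": "E2 2BB", "diagnosis": "C50"},
]
released = link_and_pseudonymise(cohort, records)
```

Only the record for the consented participant is released, carrying the Study ID and clinical fields but none of the identifiers.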

Transfer of linked data from NHS Digital back to UCL, GRAIL LLC and GRAIL Bio UK:
• NHS Digital transfers the data via SEFT or other secure, NHS Digital approved file transfer mechanism. This data is pseudonymised (only the Study ID from the linkage transfer is kept and identifiable fields are removed).
• UCL team downloads the data onto the UCL Data Safe Haven (DSH) (see below).
• GRAIL, LLC has a UK-specific, permission controlled and encrypted AWS S3 bucket created. A UCL user is designated and given access by the GRAIL team to transfer data from the DSH to this GRAIL, LLC S3 bucket.
• Once the data is in the S3 bucket, it is replicated to the US and a copy can be ingested into the GRAIL, LLC analysis pipeline.
• GRAIL LLC can then provide the pseudonymised data set to GRAIL Bio UK via a separate AWS S3 bucket.
• UCL, GRAIL, LLC and GRAIL Bio UK Ltd will have a copy of the same dataset.
• Pseudonymised dataset is available in a permission-controlled manner to GRAIL study team users.


STORAGE AND PROCESSING LOCATIONS
Amazon Web Services (AWS UK) supply IT infrastructure for GRAIL Bio UK Ltd and are therefore listed as data processors. They supply support to the system, but do not access data. Therefore, any access to the data held under this agreement would be considered a breach of the agreement. This includes granting of access to the database[s] containing the data. AWS UK use only UK data centres and provide a private cloud platform which hosts the Clinical Records Management System (“CRMS”) which was developed by GRAIL, INC (now GRAIL, LLC) and is currently managed by GRAIL Bio UK Ltd. The record-level pseudonymised data extracts referred to in section 5a (above) will be stored in secure S3 folders which are hosted on Amazon Web Services (AWS UK). The PID will be stored separately in the CRMS and is the main PID used to invite and book patients into study appointments. Only authorised study team members of GRAIL Bio UK Ltd and UCL have access to NHS Digital record-level pseudonymised data and PID data stored in the secure S3 folders hosted by AWS in the United Kingdom. Enrolled participants also consent to the transfer and storage of their health data to Grail Bio UK Ltd for the purposes of processing the pseudonymised data extracts for this application.

Amazon Web Services, Inc (USA) supply IT infrastructure for GRAIL, LLC and are therefore listed as data processors. They supply support to the system, but do not access data. Therefore, any access to the data held under this agreement would be considered a breach of the agreement. This includes granting of access to the database[s] containing the data. Enrolled participants also consent, as expressly stated in the consent form and participant information sheet, to the transfer of their pseudonymised health data to GRAIL, LLC in the US for purposes permitted by the study participant consent form. The pseudonymised data will be transferred by UCL from the UCL Data Safe Haven to a secure S3 folder hosted on Amazon Web Services, Inc. (USA). The transfer will be undertaken using a secure, encrypted network connection.

The UCL Data Safe Haven (DSH) is a safe haven system which conforms to NHS Digital’s Information Governance Toolkit. Access is via a remote desktop arrangement served via Citrix. Access is controlled via the use of a username, password, PIN and one-time token-based password. The token-based password is generated algorithmically and is changed every minute. Access will only be granted to substantive UCL employees for the purpose of processing outlined in the section above. The data analyses of the performance of screening delivery (e.g. uptake and factors that affect uptake) will be undertaken by statisticians at UCL, all using pseudonymised data. The Data Safe Haven is subject to external professional penetration testing on an ongoing basis. Failed logon attempts are recorded in the Data Safe Haven system and are managed by the Data Safe Haven Service Operation Manager. Intrusion attempts and port scans are detected and reported to the UCL security function for investigation as necessary. Data is transferred into the system via a secure gateway technology and is then retained via policy and systems that prevent data leakage (for example, through transfer of data to USB media or copy and paste to the client machine). Whilst using the DSH users are prevented from accessing any external network resources (web sites, email, etc). The SLMS Data Safe Haven is certified to ISO 27001:2013. Limited PID is also stored in the DSH for the purposes of SUMMIT research and NHS-D data linkage; this includes Date of birth, Age, GP practice name, NHS number, Postcode, Gender, Ethnicity, IMD score and rank and smoking status. This is stored securely in the UCL DSH, access to which is carefully controlled and only those that have permission to view this PID have access to the UCL data safe haven. Those that do not have permission to access PID will not be able to access the DSH where the data linkage documents are stored and will not be able to link the data to a specific patient.
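The agreement does not say which algorithm generates the one-time token. As an illustration only, a time-based one-time password in the style of RFC 6238 (TOTP) with a 60-second step behaves in the way described, giving a code that changes every minute. The function name, the shared secret and the choice of HMAC-SHA1 with 6 digits are assumptions for this sketch, not details of the Data Safe Haven.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 60, digits: int = 6) -> str:
    """RFC 6238-style one-time code; a 60 s step is assumed to mirror
    the 'changed every minute' description."""
    counter = unix_time // step  # number of whole steps since the epoch
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation as in RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# Hypothetical shared secret; codes are identical within the same minute.
secret = b"example-shared-secret"
same_minute = totp(secret, 120) == totp(secret, 179)
```

Both sides derive the code from a shared secret and the current time, so a captured code stops being valid once the time step rolls over.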

Data analysis related to the Galleri Blood test will be led by one of GRAIL’s senior bio-statisticians/bio-informaticians (employed by the funder, GRAIL LLC), in the US and UK, all using pseudonymised data. Data may be transferred by GRAIL LLC to GRAIL Bio UK.

Data processing will only be carried out by substantive employees of UCL, GRAIL, LLC and GRAIL Bio UK Ltd. All employees with access to NHS Digital Record level data have been appropriately trained in data protection and confidentiality.


MR1a - Health and Development Study - Consented Cohort Members — DARS-NIC-148100-6RFK9

Type of data: information not disclosed for TRE projects

Opt outs honoured: No - consent provided by participants of research study, No - data flow is not identifiable, Y, Identifiable, No (Reasonable Expectation, Consent (Reasonable Expectation))

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by the HSCIC, Health and Social Care Act 2012 – s261(2)(c), Health and Social Care Act 2012 – s261(7), Section 251 approval is in place for the flow of identifiable data, Health and Social Care Act 2012 – s261(2)(c), Health and Social Care Act 2012 – s261(2)(c); Health and Social Care Act 2012 – s261(7)

Purposes: No (Academic)

Sensitive: Sensitive, and Non Sensitive, and Non-Sensitive

When:DSA runs 2019-03-10 — 2022-03-09 2017.09 — 2024.04. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Hospital Episode Statistics Admitted Patient Care
  2. Hospital Episode Statistics Outpatients
  3. Hospital Episode Statistics Accident and Emergency
  4. MRIS - Flagging Current Status Report
  5. MRIS - Cause of Death Report
  6. MRIS - Cohort Event Notification Report
  7. MRIS - Scottish NHS / Registration
  8. MRIS - List Cleaning Report
  9. Demographics
  10. Civil Registration - Deaths
  11. Cancer Registration Data
  12. MRIS - Members and Postings Report
  13. Emergency Care Data Set (ECDS)
  14. Hospital Episode Statistics Accident and Emergency (HES A and E)
  15. Hospital Episode Statistics Admitted Patient Care (HES APC)
  16. Hospital Episode Statistics Outpatients (HES OP)
  17. Civil Registrations of Death

Objectives:

The MRC National Survey of Health and Development (NSHD) is the oldest and longest running of the British birth cohort studies. From an initial maternity survey of 13,687 (82%) of all births recorded in England, Scotland and Wales during one week of March, 1946, a socially stratified sample of 5,362 singleton babies born to married parents was selected for follow-up. The NSHD study team is housed within the MRC Unit for Lifelong Health and Ageing (LHA) at University College London (UCL).

The NSHD study team has collected unique lifetime data on body size and maturation, cognitive and physical function, socioeconomic status and diet; and has repeat adult data on diet, smoking, physical activity, blood pressure and lung function. The most intensive data collection in 2006-2010, when study members were aged 60-64 years, included measurement of cardiac structure and function, body composition and bone density.

The 24th and most recent data collection to the whole sample included a postal questionnaire in 2014 and a home visit by a trained research nurse for interview and assessment in 2015/2016. At the 24th follow-up, the target sample was 2816 study members still living in mainland Britain; this is the maximum sample used in the analyses. Of the remaining 2546 (47%) study members: 957 (18%) had already died, 620 (12%) had previously withdrawn permanently, 574 (11%) lived abroad, and 395 (7%) had remained untraceable for more than 5 years.
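The attrition figures above can be checked with a few lines of arithmetic. This sketch assumes the percentages are taken against the initial follow-up sample of 5,362, which is an inference from the numbers rather than something the text states. The variable names are illustrative.

```python
# Figures quoted in the text above.
initial_sample = 5362
target_sample = 2816
not_in_target = {
    "died": 957,
    "withdrawn permanently": 620,
    "living abroad": 574,
    "untraceable for more than 5 years": 395,
}

# Study members no longer in the target sample.
remaining = initial_sample - target_sample


def pct(n: int, base: int = initial_sample) -> int:
    """Percentage of the base, rounded to the nearest whole number."""
    return round(100 * n / base)


breakdown = {reason: pct(n) for reason, n in not_in_target.items()}
```

Under that assumption the four categories sum to the 2546 members outside the target sample, and the rounded percentages agree with those quoted.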

Where study members have become lost to follow up, data is being provided under a separate application, NIC-86954. NSHD will use the data under that application to seek to re-contact those study members and invite them to continue participating in the study, i.e. to re-consent these participants.

The NSHD was the first study (in 1971) to have participants flagged on the NHS Central Register for mortality (ICD codes are used to code cause of death) and cancer registrations. The LHA receives notifications quarterly on an ongoing basis.

The LHA wishes to link NSHD study members to HES data in order to improve the quality of information on hospital admissions and health outcomes for research purposes. Currently, the study obtains self-reported hospital admission data at each follow-up which are then confirmed through contact with each hospital.


The Unit has a 5-year MRC core funded programme of research based on the NSHD with the objective to investigate risk and protective factors from across the life course that influence the ageing process. This core funding has been in place since 1962 and is renewed every five years after scientific review.

The data from HES will be used to improve the identification of acute events such as those caused by cardiovascular disease (CVD). For example, the unit will assess how life course risk factor trajectories of body size, resting heart rate, blood pressure, socio-economic position (SEP) and health related behaviours, accumulate and interact to influence incidence of CVD, thus potentially identifying possibilities for earlier prevention. As the cohort is entering older age, hospital care becomes increasingly frequent and study members are thus less likely to report hospital admissions over a number of years accurately. It is therefore important to capture this information in other ways. New research within LHA on health service use is being developed which will utilise these data and investigate life course predictors of health care utilisation.

The data collected on the NSHD cohort, including that provided by NHS Digital, is used across five integrated research programmes with the overarching aim of identifying social and biological factors that affect lifelong health, ageing and the development of chronic disease risk.

The five programmes are:
1) Enhancing NSHD
2) Functional Trajectories and Cardiovascular Ageing
3) Physical Capability and Musculoskeletal Ageing
4) Mental Ageing
5) Wellbeing in older age

All those with access to the data are substantive employees of University College London.

All processing of ONS data will be in line with ONS standard conditions.

All outputs will be restricted to aggregate data with small numbers suppressed in line with the HES Analysis Guide.

The data from NHS Digital will not be used for any purpose other than that outlined in this Agreement. There will be no onward sharing of data as part of this application.

The legal basis for the applicant to continue to hold the Scottish Registration data is consent.

Yielded Benefits:

Delays in obtaining HES data have prevented the study team at UCL from investigating the life course influencers and predictors of health care utilisation. However, since receiving the data, the study team have been working on cleaning and deriving variables that will allow identification of acute events, such as coronary heart disease, heart failure, dementia etc., which will allow UCL to conduct the planned work. Despite the delays, UCL have been able to produce a couple of outputs. For example, work by Dehbi et al (Environment Int. 2017) has provided further evidence of the role of air pollution in cardiovascular mortality, which may be used to influence policies on air quality. Another article in press (British Journal of Psychiatry), looking at adolescent affective symptoms and mortality may be important in assessing health care services.

Expected Benefits:

The NSHD has informed UK health care, education and social policy for 70 years and is the oldest and longest running of the British birth cohort studies. Today, with study members in their early seventies, the NSHD offers a unique opportunity to explore the long-term biological and social processes of ageing and how ageing is affected by factors acting across the whole of life.

Evidence is growing from this cohort study and others, that factors from early life (such as growth, neurodevelopment, nutrition and family socioeconomic circumstances) as well as later life (such as adult smoking, diet, exercise and socioeconomic circumstances) affect the opportunity to age well. This is of interest to policymakers, practitioners, and older people themselves.

The research using NSHD life course information will provide insights into when in the life course interventions to prevent disease (in particular CVD) and, as the cohort ages, hospitalisation, will be most useful. This information will inform the design of future interventions which can then be tested in controlled trials. As the study is nationally representative, it will also provide valuable information regarding the factors associated with health care utilisation of the ageing population.

In particular, through knowledge transfer, public engagement, publications, presentations and invited commentaries (http://www.nshd.mrc.ac.uk/findings/) the MRC LHA has contributed to a body of evidence to influence policies and support evidence-based medicine. For example, a recent paper in PLOS Medicine comparing lifetime trajectories of overweight and obesity across NSHD and the later-born cohorts has been cited in the Government’s recent Child Obesity Strategy. Other examples highlighting the depth and breadth of this lifelong study include:

• NSHD is a member of the Dementias Platform UK, a £53 million collaboration between universities and industry established by the MRC in 2014, to transform the best dementia research into the best treatments as quickly as possible. It combines the power of multiple population studies to compare healthy people with people at all stages of dementia.

• The NSHD finding, in 2014, that more rapid rises in systolic blood pressure during midlife (even if not crossing into hypertension) were related to poorer cardiac structure (published in the European Heart Journal in 2014) has implications for treatment guidelines as it suggests that identification and treatment of people with rapidly increasing SBP, even if they are not reaching the criteria for hypertension, may be beneficial in preventing subsequent cardiovascular disease.

• The NSHD findings (published in The Lancet Diabetes & Endocrinology in 2014) suggesting that those who lost weight at any age during adulthood, even if weight was regained later, had better cardiovascular risk profiles than those who remained overweight or obese support public health strategies that help individuals to lose weight at all ages.

• In 2014, the NSHD finding that better performance in tests of physical capability (i.e. grip strength, chair rising and standing balance) in midlife was linked to higher survival rates over 13 years of follow-up was published in the British Medical Journal. This highlighted the value of these simple objective physical tests in helping to identify those people who from at least as early as midlife onwards may require more support than others to achieve a long and healthy life.

• Subsequent work examining changes in objective measures of physical capability between ages 53 and 60-64 has highlighted that age-related decline may not be entirely inevitable and is potentially modifiable. This work has also suggested that there may be a need to monitor physical capability from at least as early as midlife onwards as opportunities to help some high risk groups may already have been missed if no action is taken until later in life.

• A 2009 report on adult life chances in relation to childhood mental health using NSHD was cited by the government in support of a case for early intervention to build mental capacity and resilience.

• The study’s findings of the continuing effect of early life growth and development on health outcomes in adulthood add to the arguments for early intervention of the kind provided by the national SureStart programme.

• The 1999 paper comparing children’s diet in 1950 with that in the 1990s (‘Food and nutrient intake of a national sample of four-year-old children in 1950: comparison with the 1990s’, Public Health Nutrition) had an impact because of its evidence that the quality and nutrient value of infant and childhood diet had declined between 1950 and 1990.

• The study’s finding (published in All our Future in 1968) of the extent and inequity of the ‘waste of talent’ – in terms of high ability children who did not continue into further or higher education – added to arguments for improving opportunities for, and expectations of, children from poorer families.

• The Home and the School (1964) had a great impact, probably because it provided the first hard evidence that parents and preschool circumstances had a significant impact on ability and attainment at age eight, and so showed that preschool development and experience formed the bedrock on which primary schooling was built.

• Press reports that followed the publication of Maternity in Great Britain (1948), which were concerned with the ‘Need for Better Care and Lower Costs’ (The Times), are likely to have influenced the arguments for improvements in the care of mothers and babies.

Outputs:

The data will be used on an ongoing basis to update study member records. The database will be updated after each data release.

The primary output of the linkages with HES, ONS mortality and Cancer Registration data is the maintenance and enhancement of the NSHD-DR. This is in turn used to achieve multiple research outputs that benefit health and social care.

The programme ‘Enhancing NSHD’ examines many of the genomic, other metabolomic or epigenomic factors that influence the risk of many age-related diseases and quantitative traits, often in collaboration with external researchers.

The programme ‘Functional Trajectories and Cardiovascular Ageing’ examines which factors from across the life course promote good adult cardiovascular function and prevent disease onset, and which increase vulnerability to accelerated cardiovascular ageing.

The programme ‘Physical Capability and Musculoskeletal Ageing’ examines which factors from across the life course promote good adult physical capability and musculoskeletal health, and which increase vulnerability to accelerated decline in capability.

The programme ‘Mental Ageing’ examines which factors from across the life course promote cognitive capability and protect against depression and which factors increase vulnerability to cognitive decline.

The programme ‘Wellbeing in older age’ examines what social contexts and experiences in childhood and early adulthood promote wellbeing in later life and whether wellbeing protects against functional ageing.

Each of these programmes generates multiple publications in peer-reviewed journals annually and findings are further disseminated via conference presentations. A full list of publications produced to date plus details of the current priorities for each programme are published on the MRC LHA website at: http://www.nshd.mrc.ac.uk/.

Publications and presentations only use data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.

This MRC Unit is committed to research on ageing – outputs arising from ONS data will be anonymised in the form of tables, graphs, peer reviewed journals, presentations and books.

These data have been used in a number of publications. A full list of publications can be found at http://www.nshd.mrc.ac.uk/findings/
Examples of NSHD publications using mortality data are below:
1. Davis D, Cooper R, Terrera GM, Hardy R, Richards M, Kuh D. Verbal memory and search speed in early midlife are associated with mortality over 25 years' follow-up, independently of health status and early life factors: a British birth cohort study. Int J Epidemiol. 2016 Aug 6. pii: dyw100.
2. Zhou CK, Sutcliffe S, Welsh J, Mackinnon K, Kuh D, Hardy R, Cook MB. Is birthweight associated with total and aggressive/lethal prostate cancer risks? A systematic review and meta-analysis. Br J Cancer. 2016 Mar 29;114(7):839-48.
3. Teschendorff AE, Yang Z, Wong A, Pipinikas CP, Jiao Y, Jones A, Anjum S, Hardy R, Salvesen HB, Thirlwell C, Janes SM, Kuh D, Widschwendter M. Correlation of Smoking-Associated DNA Methylation Changes in Buccal Cells With DNA Methylation Changes in Epithelial Cancer. JAMA Oncol. (2015 Jul 1); 1(4):476-85
4. Hartaigh B, Gill TM, Shah I, Hughes AD, Deanfield JE, Kuh D, Hardy R. Association between resting heart rate across the life course and all-cause mortality: longitudinal findings from the Medical Research Council (MRC) National Survey of Health and Development (NSHD). J Epidemiol Community Health, 2014 Sep;68(9):883-9.
5. Albanese E, Strand BH, Guralnik JM, Patel KV, Kuh D, et al. (2014) Weight Loss and Premature Death: The 1946 British Birth Cohort Study. PLoS ONE 9(1): e86282.
6. Maughan B, Stafford M, Shah I, Kuh D. Adolescent conduct problems and premature mortality: follow up to age 65 in a national birth cohort. Psychological Medicine 2013 Aug 21:1-10.
7. Ong K, Hardy R, Shah I, Kuh D on behalf of the NSHD scientific and data collection teams. Childhood stunting and mortality between 36 and 64 years: the British 1946 birth cohort study. Journal of Clinical Endocrinology and Metabolism. 2013 May;98(5):2070-7.
8. Strand BH, Kuh D, Shah I, Guralnik J, Hardy R. Childhood, adolescent and early adult body mass index in relation to adult mortality: results from the British 1946 birth cohort. J Epidemiol Community Health. 2012 Mar; 66(3): 225–232.
9. Henderson M, Hotopf M, Shah I, Hayes RD, Kuh D. Psychiatric disorder in early adulthood and risk of premature mortality in the 1946 British Birth Cohort. BMC Psychiatry 2011 Mar 8;11:37.
10. Kuh D, Shah I, Richards M, Mishra G, Wadsworth M, Hardy R. Do childhood cognitive ability or smoking behaviour explain the influence of lifetime socioeconomic conditions on premature adult mortality in a British post war birth cohort? Soc Sci Med. 2009 May; 68(9): 1565–1573.
11. Clennell S, Kuh D, Guralnik J, Patel K, Mishra G. Characterisation of smoking behaviour across the life course and its impact on decline in lung function and all-cause mortality: evidence from a British birth cohort. Journal of Epidemiology and Community Health 2008;59:304-14.
12. Kuh D, Richards M, Hardy R, Butterworth S, Wadsworth MEJ. Childhood cognitive ability and deaths up until middle age: a post war birth cohort study. International Journal of Epidemiology 2004;33:408-13.
13. Kuh D, Hardy R, Langenberg C, Richards M, Wadsworth MEJ. Mortality in adults aged 26-54 years related to socioeconomic conditions in childhood and adulthood: post war birth cohort study. British Medical Journal 2002;325:1076-80.

Processing:

NSHD receives data from two main sources i) collected from the study members themselves over the past 70 years and ii) from NHS Digital; these data are held in the NSHD-Data Repository (NSHD-DR). Study participants are flagged with NHS Digital. NHS Digital provides notifications of deaths and cancer registrations on a quarterly frequency. These data are incorporated into the NSHD-DR to enhance that dataset for research purposes. The mortality data (fact of death) are also used for administrative purposes. As well as being used to identify specific health events, linkage to HES data will allow the derivation of useful aggregate variables such as number of hospital admissions and length of time in hospital. The derived aggregate variables are then used for other research analyses by LHA scientists and may be shared with external researchers.
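As an illustration of the derived aggregate variables mentioned above (number of hospital admissions and length of time in hospital), the sketch below summarises admission records per study member. The record layout, identifiers and function name are hypothetical and do not reflect the NSHD-DR schema or HES field names.

```python
from collections import defaultdict
from datetime import date


def summarise_admissions(admissions):
    """Count admissions and total days in hospital for each study member.

    `admissions` is an iterable of (member_id, admitted, discharged) tuples;
    this layout is an assumption made for illustration.
    """
    totals = defaultdict(lambda: {"n_admissions": 0, "days_in_hospital": 0})
    for member_id, admitted, discharged in admissions:
        totals[member_id]["n_admissions"] += 1
        # A same-day discharge contributes zero days under this definition.
        totals[member_id]["days_in_hospital"] += (discharged - admitted).days
    return dict(totals)


# Entirely made-up records for two pseudonymous members.
example = [
    ("A", date(2015, 1, 1), date(2015, 1, 4)),
    ("A", date(2016, 3, 10), date(2016, 3, 10)),
    ("B", date(2014, 6, 1), date(2014, 6, 8)),
]
summary = summarise_admissions(example)
```

Only the per-member totals leave the function, so downstream analyses can work with the derived variables rather than the episode-level records.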

In scientific studies in the period that pre-dated the MREC/LREC structure, consent was assumed by participation. In this study, the period of assumed consent covers the years from birth to age 35 years (from 1946 to 1981). Ethical permission for the 1982 and 1989 studies was obtained from the local ethical committees that preceded the LRECs and were run by the teaching hospital to which the NSHD research team were then affiliated (Bristol in 1982 and UCL in 1989). In 1999, MREC approval was obtained for the data collection and its use for research purposes by the team and their collaborations (MREC98/1/121). Ethical approval for the feasibility study (MREC06/Q1407/26) and extension study (07/H1008/245) was obtained from the Central Manchester Research Ethics Committee, and additional Scottish approval (08/MRE00/12) was granted through the Scotland A Research Ethics Committee. Most recently, a favourable opinion was obtained from the London Queen Square REC (14/LO/1073) and Scotland A REC (14/SS/1009).

The consented cohort does include participants who have consented using previous versions of consent material. Consent is taken at face to face contact, which is typically a home visit by a research nurse; this occurs roughly every five to ten years. Consent is sought from study participants, using the updated consent materials, prior to each data collection. Consent materials provided to participants explain the purpose of the data collection and provide the opportunity for individuals to withdraw from the data collection and/or the entire study. The data flow for participants who are either lost to follow-up or non-responders is covered by Section 251 support. The Section 251 support does not cover any participant who has withdrawn their consent.

Derived NHS Digital data will be linked to the NSHD-DR which stores all study member data in pseudonymised form going back to 1946. NHS Digital identifiable data can only be viewed by named NSHD staff and is stored separately from pseudonymised derived data. The NSHD-DR additionally holds hospital admissions data that was previously obtained directly from the hospitals or General Practitioners.


Policy Research Unit for Children, Young People and Families — DARS-NIC-393510-D6H1D

Type of data: information not disclosed for TRE projects

Opt outs honoured: Y, No - data flow is not identifiable, Anonymised - ICO Code Compliant, No (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012, Section 42(4) of the Statistics and Registration Service Act (2007) as amended by section 287 of the Health and Social Care Act (2012), Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 - s261 - 'Other dissemination of information', Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(2)(b)(ii), Health and Social Care Act 2012 – s261(2)(a)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2019-03-31 — 2022-03-30 2017.06 — 2024.04. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Hospital Episode Statistics Outpatients
  2. Hospital Episode Statistics Admitted Patient Care
  3. Hospital Episode Statistics Critical Care
  4. Hospital Episode Statistics Accident and Emergency
  5. Office for National Statistics Mortality Data (linkable to HES)
  6. Office for National Statistics Mortality Data
  7. Civil Registration - Deaths
  8. HES:Civil Registration (Deaths) bridge
  9. Emergency Care Data Set (ECDS)
  10. Mental Health and Learning Disabilities Data Set
  11. Mental Health Minimum Data Set
  12. Mental Health Services Data Set
  13. Civil Registration (Deaths) - Secondary Care Cut
  14. Birth Notification Data
  15. Civil Registration - Births
  16. Community Services Data Set
  17. COVID-19 Second Generation Surveillance System
  18. Covid-19 UK Non-hospital Antigen Testing Results (pillar 2)
  19. HES-ID to MPS-ID HES Accident and Emergency
  20. HES-ID to MPS-ID HES Admitted Patient Care
  21. HES-ID to MPS-ID HES Outpatients
  22. MRIS - Bespoke
  23. MSDS (Maternity Services Data Set)
  24. MSDS (Maternity Services Data Set) v1.5
  25. Civil Registrations of Death - Secondary Care Cut
  26. Hospital Episode Statistics Accident and Emergency (HES A and E)
  27. Hospital Episode Statistics Admitted Patient Care (HES APC)
  28. Hospital Episode Statistics Critical Care (HES Critical Care)
  29. Hospital Episode Statistics Outpatients (HES OP)
  30. Community Services Data Set (CSDS)
  31. COVID-19 Second Generation Surveillance System (SGSS)
  32. COVID-19 UK Non-hospital Antigen Testing Results (Pillar 2)
  33. Maternity Services Data Set (MSDS) v1.5
  34. Mental Health and Learning Disabilities Data Set (MHLDDS)
  35. Mental Health Minimum Data Set (MHMDS)
  36. Mental Health Services Data Set (MHSDS)
  37. COVID-19 SGSS First Positives (Second Generation Surveillance System)
  38. Maternity Services Data Set (MSDS) v2

Objectives:

The data is requested for a programme of research within the healthcare provision theme of the Policy Research Unit for Children, Young People and Families, within University College London (UCL) funded by the Department of Health (DoH). The objectives of the research are:

a) To determine variation in use of secondary care services by children and young people over time and their transition to adult services. UCL will analyse variation by patient characteristics (e.g. age, gender, GP registration), by area/unit-level characteristics such as trust, by practice characteristics such as QOF scores, and by area indicators for deprivation.

b) To determine risk factors for emergency use of secondary care and risk factors for recurrent use (e.g. according to individual patient characteristics such as age, chronic conditions, deprivation, sex), past use (e.g. frequency and type of past contact such as A&E or admissions). UCL will also examine NHS trust and area factors associated with secondary care use. Where possible, UCL will use birth cohort analyses, based on postnatal admissions of children linked to maternity records, using maternal risk factors (e.g. maternal age) and birth factors (e.g. birth weight, prolonged stay in neonatal intensive care) to investigate associations with risk of emergency use of secondary care and other outcomes, including mortality.

c) UCL will conduct prognostic analyses for children and young people based on diagnosis and procedure codes to identify risk factors for emergency hospital care and for subsequent long-term adverse outcomes into adulthood (e.g.: further emergency admissions or death).

d) UCL would also like to request all ONS death records for deaths registered in England from 1st January 1998 until as late as possible, for all persons who died aged 0-55; in other words, both records that link to HES and those that do not link to HES. UCL need all deaths in order to assess the degree of misclassification of outcome (alive/dead) due to linkage errors between ONS and HES datasets. It is crucial that UCL get the age at death on the ONS records in order to do this. Full dates of death are required to be able to estimate age at death in days for the work on infant mortality (for instance, to be able to distinguish first-week deaths from later neonatal deaths and from postneonatal deaths), as well as for the work on cause-specific mortality where UCL classify deaths based on admissions within a certain number of days from death. Additionally, having date of death available enables UCL to determine delay in death registration for data validation.
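A minimal sketch of why full dates are needed for item d): age at death in days separates first-week, later neonatal and postneonatal deaths, and the gap between death and registration gives the registration delay. The cut-offs used here (under 7 days, 7 to 27 days, 28 days to under one year, approximated as 365 days) follow the conventional definitions and are an assumption about how UCL classify deaths. The function names are hypothetical.

```python
from datetime import date


def classify_infant_death(date_of_birth: date, date_of_death: date) -> str:
    """Classify a death by age in days, using conventional cut-offs."""
    age_days = (date_of_death - date_of_birth).days
    if age_days < 0:
        raise ValueError("date of death precedes date of birth")
    if age_days < 7:
        return "early neonatal"
    if age_days < 28:
        return "late neonatal"
    if age_days < 365:  # approximation of "under one year"
        return "postneonatal"
    return "not an infant death"


def registration_delay_days(date_of_death: date, date_of_registration: date) -> int:
    """Days between a death and its registration."""
    return (date_of_registration - date_of_death).days


# Made-up dates for illustration.
born = date(2010, 3, 1)
first_week = classify_infant_death(born, date(2010, 3, 5))
later_neonatal = classify_infant_death(born, date(2010, 3, 20))
postneonatal = classify_infant_death(born, date(2010, 6, 1))
delay = registration_delay_days(date(2010, 3, 5), date(2010, 3, 12))
```

With only month or year of death, the first two categories could not be told apart, which is the point the request makes.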


What will be done with the data?
UCL will use longitudinal HES data, linked to ONS death records, to construct cohorts for a number of patient subgroups defined by age, sex, and clinical characteristics, to address the questions above. All analyses will be done within the safe haven. All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.
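Small-number suppression of an aggregate table can be sketched as below. The threshold of 8 (so counts of 1 to 7 are masked) and the "*" marker are placeholders chosen for illustration; the actual rules, including any rounding, are those set out in the HES Analysis Guide and are not reproduced here.

```python
def suppress_small_numbers(counts, threshold=8, marker="*"):
    """Mask non-zero counts below `threshold` in a table of aggregate counts.

    `threshold` and `marker` are illustrative assumptions, not the
    values mandated by the HES Analysis Guide.
    """
    return {
        group: (marker if 0 < n < threshold else n)
        for group, n in counts.items()
    }


# Made-up aggregate counts by age group.
table = {"0-4": 0, "5-9": 3, "10-14": 8, "15-19": 120}
published = suppress_small_numbers(table)
```

A complete disclosure-control step would also consider secondary suppression, where a masked cell could be recovered from row or column totals; this sketch does not attempt that.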

Yielded Benefits:

Benefits from the data already received include: The data has shown an increase in adversity-related injury rates (defined as self-inflicted or related to drug/alcohol use or violence) in teenagers in England. The increases may indicate an increasing problem, particularly in females and for intentional self-injury in males aged 15-19 years. UCL aim to extend this research to determine how early preventive interventions in community services (schools, family, and primary care) can affect presentations to hospital. Research has also included work on mortality in vulnerable mothers with opioid use during pregnancy, which found that mothers with opioid use were 11 times more likely to die during the 10 years after childbirth. UCL also explored time to next live birth for vulnerable mothers, finding that women with records indicating vulnerability, such as mental health problems, age <20 years or high parity (defined as the number of pregnancies reaching viable gestational age, including live births and stillbirths), have their next child sooner than other women. Results have been fed back to the Department of Health and will inform the next years of the CPRU programme.

Expected Benefits:

The research carried out by UCL directly influences DoH policy makers, service providers, healthcare professionals and the general public. This directly benefits the health of children and the healthcare provided to children in the here and now and this is key to reducing future burden on the NHS.

Benefits from the data already received include:

For example, UCL's research, recently published in The Lancet, showed similarly increased risks of death over the 10 years after hospital discharge for adolescents hospitalised for self-harm as for those hospitalised for drug or alcohol misuse or violence. The results have led to recommendations for similar psychosocial interventions to be considered for both groups, not just those admitted for self-harm, and to include preventive strategies for drug and alcohol misuse, which accounts for just as many deaths in the 10 years after hospital discharge as does suicide. UCL's research will extend this type of preventive thinking to a range of population subgroups within the child and young adult age range and allow follow up to determine long-term, and potentially preventable, outcomes.

Additionally, UCL's research on readmissions has shown that for children and young people these occur predominantly in patients with underlying long-term conditions. When looking specifically at 30-day readmissions (emergency readmissions within 30 days of a previous discharge, which are subject to the readmission rule) UCL found that about half of readmissions were for a problem different from the reason for the first emergency admission. This further suggests that readmissions in children and young people are due to complexity of cases rather than hospital failings, and urges a review of the current policy of not reimbursing hospitals for care provided for 30-day readmissions. UCL have also shown that chronic conditions underlie the sharp increase in admissions across the transition from paediatric to adult services, which has important implications for specialist care.
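The 30-day readmission definition quoted above can be sketched as follows. The input layout (one patient's emergency spells as admission date, discharge date and primary diagnosis code), the inclusive 30-day boundary, and the use of an identical diagnosis code to mean "same reason" are all assumptions made for illustration, not UCL's published method.

```python
from datetime import date


def thirty_day_readmissions(spells, window_days=30):
    """Flag emergency admissions within `window_days` of the previous discharge.

    `spells` is a list of (admitted, discharged, primary_diagnosis) tuples
    for a single patient; this layout is hypothetical.
    """
    ordered = sorted(spells, key=lambda spell: spell[0])
    flagged = []
    for previous, current in zip(ordered, ordered[1:]):
        gap = (current[0] - previous[1]).days
        if 0 <= gap <= window_days:
            flagged.append({
                "admitted": current[0],
                "gap_days": gap,
                "same_reason": current[2] == previous[2],
            })
    return flagged


# Made-up spells for one patient; diagnosis codes are placeholders.
spells = [
    (date(2015, 1, 1), date(2015, 1, 3), "J45"),
    (date(2015, 1, 20), date(2015, 1, 22), "K35"),
    (date(2015, 4, 1), date(2015, 4, 2), "J45"),
]
readmissions = thirty_day_readmissions(spells)
```

Here the second spell counts as a readmission for a different problem, while the third falls outside the window and is treated as a new admission.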

The measurable benefits to the health service will be in improving the understanding of longitudinal patterns of emergency health care use overall and which groups (e.g. with chronic conditions) are most at risk. The study will provide new knowledge about long-term outcomes across the child life course and into adulthood. Specifically,
1) assessing the use of hospital service and relevant outcomes, including mortality before and after transition from paediatric to adult health care for young people with chronic conditions,
2) assessing variation in readmission rates by hospital and determining to what extent this variation is due to case mix (based on the full longitudinal hospitalisation record), organisational factors or changes over time,
3) comparing outcomes for vulnerable mothers (e.g. those with a past history of adversity-related injury admissions) and children.

The research (using the new data) will extend this type of preventive thinking to a range of population subgroups within the child and young adult age range. The benefits to the service will be in improving the understanding of longitudinal patterns of emergency health care use overall and which groups (e.g. with chronic conditions) are most at risk. The study will provide new knowledge about long-term outcomes across the child life course and into adulthood. The results may be used to inform NHS services through, for example, targeting of preventive care strategies, evaluation of the quality of care, and development of services and policy to support follow up of risk groups. UCL will also engage with CLARHC about implementation of the research into practical services within UCL Partners.


Finally, UCL have a focus on vulnerable children and families, and use admission data, combined with their indicators for chronic conditions and birth characteristics to explore use of health services for vulnerable mothers and children. All papers are reviewed and commented on by DoH and findings fed back to DoH policy makers as well as more widely, for example, through presentations to young people groups (through the National Children’s Bureau), to the NHS (e.g. through the Child and YoCLARHC), and through trusts to clinicians (e.g. through seminars and CPRU symposia involving patient groups, policy makers and clinicians).

The results may be used to inform NHS services through, for example, targeting of preventive care strategies during pregnancy to support vulnerable mothers and children, evaluation of the quality of care, and development of services and policy to support follow up of risk groups who can be recognised by hospital services (e.g. those with underlying chronic conditions, or indicators of adversity).

All of the CPRU current and past research can be found on the CPRU website (https://www.ucl.ac.uk/cpru).

Outputs:

The programme of research in this application informs policy and practice and all proposals and outputs are seen and approved by the Department of Health (DoH). Through UCL’s Children’s Policy Research Unit (CPRU) there will be regular engagement with the DoH about the projects during development and outputs, and DoH will review outputs to give feedback. All analyses undertaken as part of this programme of research for the policy research unit aim to provide evidence to inform health care professionals, service providers, policy makers, and service users about children’s health and how services meet their needs.

Other outputs include presentations to service providers through meetings with the RCPCH (Royal College of Paediatrics and Child Health), the North London Collaborations for Leadership in Applied Research and Care (CLARHC) and through the academic health sciences network (AHSN). The findings will also be presented to clinicians at clinical practice meetings including but not limited to the Royal College of Paediatrics and Child Health in 2017 and 2018. This engagement is occurring with the direct goal of changing practice in the health care field.

The findings will also be published in peer reviewed journals and policy briefings for the DoH. The projects in this application are expected to finish by December 2018.

Specifically, the research will inform DoH policy makers, service providers and practitioners about patient and service factors associated with emergency use of secondary care and long-term adverse outcomes through the child life course and into adulthood. The research programme will engage with DoH policy makers, practitioners and the public during the research, to refine questions and applications of findings, and during the dissemination phase. In this way, UCL will ensure that the study is relevant to NHS systems and UCL will endeavour to feed back results to the NHS.

The mechanisms for engagement and dissemination with NHS systems and the public are as follows:

a) UCL has well-established mechanisms for patient and public involvement through CPRU. This is facilitated by the National Children’s Bureau (NCB) Research Centre.

b) The study is conducted as part of a programme of research for the Policy Research Unit (CPRU) for Children, Young People and Families, funded by the Department of Health Policy Research Programme. CPRU aims to improve the health of children, young people and families by undertaking research to provide evidence for health policy and practice. The CPRU programme requires regular engagement with policy makers at the Department of Health.

c) The project team at the Great Ormond Street Institute of Child Health contribute to the Academic Health Science Network at UCL Partners AHSN theme on Integrated children and young people’s programme which aims to implement research findings into practice. Engagement is also through the CLARHC, hosted by UCL Partners.

The papers resulting from these studies will be published in peer-reviewed journals (such as the Lancet, Archives of Disease in Childhood, PLoS Medicine, BMJ Open) and presented at scientific conferences (such as the International Population Data Linkage Conference, International Society for the Prevention of Child Abuse and Neglect, Royal College of Paediatrics and Child Health annual conference, and Informatics for Health conference). UCL aim to present the work at scientific conferences during 2017 and use feedback provided at these meetings to write up papers to be submitted for publication in late 2017 and 2018.

Processing:

Only individuals, working under appropriate supervision on behalf of data controller(s) / processor(s) within this agreement, who are subject to the same policies, procedures and sanctions as substantive employees will have access to the data and only for the purposes described in this agreement.

ONS mortality data will be processed according to the standard Office for National Statistics terms and conditions.

The data will not be shared with third parties or linked to any other datasets.

UCL have no requirement to re-identify the supplied data and will not attempt to do so.

The data requested will be kept in UCL’s Data Safe Haven (IDHS). It has been certified to the ISO27001 information security standard and conforms to the NHS Information Governance Toolkit. A file transfer mechanism enables information to be transferred into the Safe Haven simply and securely.

IDHS uses Dual Factor Authentication to access and handle data transferred into the IDHS service. This ensures that only the named applicants will have access to the data from IDHS. Removing data from the Data Safe Haven is only allowed for the PI.

Data flows
When the data extract is available from NHS Digital, a nominated researcher will download the data and immediately transfer it into the UCL data safe haven. Once in the data safe haven, researchers based at the Institute of Child Health and Farr Institute of Health Informatics London (the researchers are all substantive employees of UCL apart from one PhD student) will be able to access the data in the safe haven. The IDHS safe haven operates as a walled space and researchers are not able to connect to the internet or export data from it.

All outputs will be in aggregate form only with small numbers suppressed in line with the HES analysis guide.
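As an illustration only, small-number suppression of an aggregate table could be sketched as below. The threshold of 8 (hiding counts of 1 to 7), the decision to leave zero counts visible and the function name are assumptions made for this sketch; none of them is taken from this agreement, and the rules that actually apply are those in the HES analysis guide.

```python
from typing import Dict, Optional


def suppress_small_numbers(
    counts: Dict[str, int], threshold: int = 8
) -> Dict[str, Optional[int]]:
    """Return a copy of `counts` with small non-zero values suppressed.

    A suppressed cell is represented by None. The default threshold is an
    assumption for this sketch, not a value stated in the agreement.
    """
    suppressed: Dict[str, Optional[int]] = {}
    for group, n in counts.items():
        # Zero counts are kept; counts from 1 to threshold - 1 are hidden.
        suppressed[group] = None if 0 < n < threshold else n
    return suppressed


# Hypothetical aggregate table of admissions by age band.
table = {"age 0-4": 120, "age 5-9": 3, "age 10-14": 0}
published = suppress_small_numbers(table)
```

A real disclosure check would also need to consider secondary suppression (cells that can be recovered from row or column totals), which this sketch does not attempt.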


PreHOspital Triage for potential stroke patients: lessONs from systems Implemented in response to COVID19 (PHOTONIC) — DARS-NIC-680546-S0V4K

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant, No (Does not include the flow of confidential data, Statutory exemption to flow confidential data without consent)

Legal basis: Health and Social Care Act 2012 – s261(2)(a)

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2023-10-10 — 2025-10-09 2024.03 — 2024.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Civil Registrations of Death - Secondary Care Cut
  2. Emergency Care Data Set (ECDS)
  3. Hospital Episode Statistics Admitted Patient Care (HES APC)

Objectives:

University College London (UCL) requires access to NHS England data for the purpose of the PHOTONIC study.

The following is a summary of the aims of the research project provided by UCL:

“The PHOTONIC study aims to assess how prehospital triage for suspected stroke patients was set up, run and experienced by patients, carers and staff during the COVID-19 (Coronavirus Disease 2019) pandemic. It aims to provide robust evidence on how prehospital video triage affects patient outcomes and cost-effectiveness, with the results used to improve care for both stroke patients and patients displaying stroke symptoms (‘stroke mimics’) which were subsequently diagnosed not to be stroke.

“Some of the most common stroke mimics are seizures, migraine, fainting, serious infections and functional neurological disorder.

“The study will utilise four study areas (North Central London, East Kent, Maidstone, and Darent Valley) where prehospital video triage has been implemented since 2020. Throughout the analysis the intervention is introduced in different sites at different times and different individuals are in the analysis at each timepoint. This allows for the period before implementation in the intervention sites to act as a further control. The quantitative component of the study will assess how prehospital triage affects patient transfer, care delivery and patient outcomes for stroke patients and mimics using longitudinal data.

“Firstly, the study will analyse how prehospital triage was set up, run, and experienced by patients, carers, and staff.

“Secondly, the study will analyse the impact of prehospital triage on healthcare services and patient outcomes. NHS England data will be studied to analyse whether introducing prehospital triage results in: more patients being taken to the right hospital service; more patients remaining at home and avoiding hospital admission; patients (stroke and non-stroke) getting the right care more quickly, and if there are any adverse effects of prehospital triage; and improvements in how well patients do after their stroke.

“Thirdly, the study will analyse whether prehospital video triage services are delivering good value for money to the NHS. Analyses will compare performance before and after prehospital triage was introduced. Analyses will also compare the areas that are using prehospital triage with other parts of England that are not currently using it.”

The following NHS England data will be accessed:
• Hospital Episode Statistics Admitted Patient Care - Necessary to identify patients admitted with strokes and stroke mimic symptoms through ICD-10 diagnosis codes and to be able to obtain the overall length of stay for both subsets of patients
• Emergency Care Data Set (ECDS) - necessary to identify which patients went to an emergency department and whether they arrived by ambulance.
o UCL hopes to be able to identify patients with stroke mimic symptoms by identifying those who had an ECDS diagnosis of stroke but no longer had a diagnosis of stroke once they were admitted as an inpatient (after linkage).
• Civil Registration Deaths data - necessary to obtain mortality data for both subsets of patients
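The stroke mimic rule described above (an ECDS diagnosis of stroke with no stroke diagnosis in the linked inpatient record) could be sketched as follows. This is an illustration only: the ICD-10 prefixes I60 to I64, the record layout and all names are assumptions for this sketch, and the study's actual code lists are not given in this agreement.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed ICD-10 prefixes for stroke; the study's real code list is not stated.
STROKE_ICD10_PREFIXES = ("I60", "I61", "I62", "I63", "I64")


@dataclass
class LinkedPatient:
    """Hypothetical linked ECDS and HES APC record for one patient."""
    patient_id: str
    ecds_stroke_diagnosis: bool  # stroke recorded in the emergency department
    apc_diagnosis_codes: List[str] = field(default_factory=list)


def has_apc_stroke(codes: List[str]) -> bool:
    """True if any inpatient ICD-10 code falls under an assumed stroke prefix."""
    return any(code.startswith(STROKE_ICD10_PREFIXES) for code in codes)


def classify(patient: LinkedPatient) -> str:
    """Label a patient as 'stroke', 'mimic' or 'other'."""
    if has_apc_stroke(patient.apc_diagnosis_codes):
        return "stroke"
    if patient.ecds_stroke_diagnosis:
        # Stroke suspected in the emergency department, not confirmed on admission.
        return "mimic"
    return "other"
```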

The level of data will be pseudonymised.

The data for the cohort will be minimised as follows:
• Limited to data between April 2019 and the latest month available
• Limited to a study cohort identified by NHS England limited to conditions relevant to the study identified by specific diagnosis codes;
• Limited to individuals (in addition to meeting the diagnosis criteria above) who were aged 18 or over at the start of the episode.
• Limited to England - having a larger control group from the whole of England would provide a larger and more representative sample, thereby diluting the effects of potential contamination and heterogeneity in stroke services.
• Civil Registration Deaths will be restricted to individuals meeting the ECDS or HES APC criteria as specified above.

UCL is the research sponsor and the controller as the organisation responsible for ensuring that the data will only be processed for the purpose described above.

The lawful basis for processing personal data under the UK GDPR is:
• Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller;

The lawful basis for processing special category data under the UK GDPR is:
• Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

The study has been funded by the National Institute for Health Research (NIHR) Health and Social Care Delivery Research programme and is specifically for the PHOTONIC study described.

Amazon Web Services (AWS) is a processor acting under the instructions of UCL. AWS’ role is limited to secure backup of data stored in UCL’s Data Safe Haven.

UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

Patients and the public were central to the study from the outset. The PPIE group includes two stroke survivors and a Stroke Association representative. The patient representatives provided detailed feedback on the wording and content of the research application. This included clarifying the language used in the plain English summary and research questions; patient representatives also influenced how the case for the research was presented (e.g. identifying ways in which the research might benefit the NHS at system-level, and encouraging UCL to foreground the need for prompt, appropriate care for both stroke and non-stroke patients), the study design (e.g. helping us clarify the approach to conducting and analysing interviews), and knowledge mobilisation strategy (e.g. identifying several relevant dissemination opportunities).

The team incorporated this valuable feedback throughout. In addition, over the course of preparing this study the team consulted with other patient representatives of the South East Coast Ambulance Service (SECAmb) patient representative group, who confirmed that they are fully supportive of the purpose and approach of this work and have agreed to join the Study Steering Committee. The stroke survivor representatives are full members of the Study team. They will thus participate in the monthly team meetings and contribute to all aspects of the research that they wish to, e.g. research strategy, recruitment documents, interpretation of findings, co-authoring articles and summaries, and presenting at events.

Expected Benefits:

It is hoped the findings of this research study will influence UK national guidelines on the use of prehospital video triage for patients with stroke. UCL expect that the intervention will improve patient transfer and care delivery, and consequently patient outcomes in terms of length of stay and mortality.

The use of the data could:
• Help the system to better understand the health and care needs of populations.
• Help inform health economic models which will generate incremental cost effectiveness ratios (ICERs). When judged against cost effectiveness thresholds, ICERs provide policymakers such as the National Institute for Health and Care Excellence (NICE) with information to inform the cost effectiveness of prehospital triage.
• Advance understanding of regional and national trends in health and social care needs.
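For readers unfamiliar with the incremental cost effectiveness ratio mentioned in the list above, it is conventionally the difference in mean cost between an intervention and its comparator divided by the difference in mean effect. The sketch below shows that arithmetic only; the function name and the example figures are hypothetical and do not come from the PHOTONIC study.

```python
def icer(cost_new: float, cost_old: float,
         effect_new: float, effect_old: float) -> float:
    """Incremental cost effectiveness ratio: extra cost per extra unit of effect.

    Effects are typically measured in quality-adjusted life years (QALYs).
    """
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        # The ratio is undefined when the two options are equally effective.
        raise ValueError("ICER is undefined when the effects are equal")
    return (cost_new - cost_old) / delta_effect


# Hypothetical figures: triage costs 200 more per patient and gains 0.25 QALYs.
example = icer(cost_new=1200.0, cost_old=1000.0, effect_new=0.75, effect_old=0.5)
# example is the extra cost per QALY gained, to be compared with a threshold.
```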

If the intervention is found not to be effective and/or cost effective, the study will provide the intervention sites with justification to discontinue prehospital video triage for stroke patients. Time and resources can then be used more efficiently elsewhere.

Patients will benefit from findings as policymakers are better informed on the most effective form of emergency care for stroke patients. It is crucial to identify whether prehospital video triage results in more efficient and timely care for patients since this will have a positive impact on patient outcomes for stroke patients, and free up resources for the treatment of other patients in the NHS.

The study’s results are urgent as they will inform sites that are yet to roll out prehospital video triage, preventing them from potentially implementing an intervention which is not beneficial.

It is hoped that publication of findings will add to the body of evidence that is considered by those making decisions that affect patients and the public such as organisations setting policy and strategy or health and care professionals delivering individual care.

The results from this study will feed into a wider body of research exploring the use of prehospital video triage in stroke patients. A national pilot study will roll out trial periods of prehospital triage across England in 2023. This pilot is overseen by a group of individuals, two of whom are the National Speciality Adviser and GIRFT National Clinical Lead for Stroke, who are both members of the PHOTONIC team. There are therefore vital links in place to ensure the results of PHOTONIC reach the relevant sites.

Outputs:

5c. Specific Outputs Expected, Including Target Date:
The expected outputs of the processing will be:
• A final report of findings to key stakeholders (July 2024)
• Submissions to peer reviewed journals (Open access from July 2024)

The outputs will be communicated to relevant recipients through the following dissemination channels:
• A final report of findings to key stakeholders (July 2024)
• Submission to open access peer reviewed journals (Open access from July 2024)
• Presentations to conferences
• Accessible summaries
• Short films
• Slide sets
• PHOTONIC podcast series
• Webinars

The outputs will not contain NHS England data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived. Outputs from the analysis are expected July 2024.

Processing:

No data will flow to NHS England for the purposes of this agreement.

NHS England will provide the relevant records from the HES APC, ECDS and Civil Registration Deaths datasets to UCL. The data will contain no direct identifying data items. The data will be pseudonymised and individuals cannot be reidentified through linkage with other data in the possession of the recipient. For the avoidance of doubt, the data from NHS England is not linked with any other data.

The data will not be transferred to any other location and will be stored on the servers at UCL in the Data Safe Haven. Amazon Web Services provides cloud hosting services to UCL and will store the data as contracted by UCL.

The data will be accessed onsite at the premises of UCL or by authorised personnel from the Institute of Epidemiology and Health Care at UCL via remote access.

The Controller must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:

- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation’s DSPT (or other security arrangements as per this DSA) and complies with the organisation’s remote access policy.

The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

The data will not leave England at any time.

Access is restricted to individuals within the Institute of Epidemiology and Health Care of UCL who have authorisation from the Principal Investigator of the PHOTONIC Study. All such individuals are substantive employees of UCL. All personnel accessing the data have been appropriately trained in data protection and confidentiality.

The ECDS and HES APC dataset extracts will be sent separately by NHS England to UCL and linked at the person record level using EPIKEY by authorised personnel at UCL.
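A record-level link of two separately delivered extracts on a shared pseudonymised key could look like the sketch below. It is illustrative only: the agreement names EPIKEY as the linking field, but the row layout, the other field names and the choice of an inner join are assumptions made for this sketch.

```python
from collections import defaultdict
from typing import Dict, List


def link_records(ecds_rows: List[dict], apc_rows: List[dict],
                 key: str = "EPIKEY") -> List[Dict[str, object]]:
    """Inner-join two extracts on a shared pseudonymised key.

    Rows without a match in the other extract are dropped.
    """
    # Index the inpatient rows by key so each ECDS row is matched in one lookup.
    apc_by_key = defaultdict(list)
    for row in apc_rows:
        apc_by_key[row[key]].append(row)

    linked = []
    for ecds in ecds_rows:
        for apc in apc_by_key.get(ecds[key], []):
            linked.append({"key": ecds[key], "ecds": ecds, "apc": apc})
    return linked


# Hypothetical rows; the field names other than EPIKEY are invented.
ecds_extract = [{"EPIKEY": "a1", "arrived_by_ambulance": True},
                {"EPIKEY": "b2", "arrived_by_ambulance": False}]
apc_extract = [{"EPIKEY": "a1", "length_of_stay": 4}]
linked_rows = link_records(ecds_extract, apc_extract)
```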

The patients in these extracts are linked to Civil Registration (Deaths). UCL will receive the full Date of Death (DOD) and on receipt, UCL will derive 30-day and 90-day mortality indicators and destroy the full DOD.
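Deriving the two mortality indicators and then discarding the full date of death could be sketched as follows. This is an illustration under stated assumptions: the agreement does not say which date the 30 and 90 days are counted from, so the `index_date` field (for example an arrival or admission date), the inclusive day boundaries and all names here are hypothetical.

```python
from datetime import date


def derive_mortality_flags(record: dict) -> dict:
    """Return a copy of `record` with 30- and 90-day flags and no date of death.

    `index_date` is an assumed field holding the date counting starts from.
    """
    dod = record.get("date_of_death")
    days = (dod - record["index_date"]).days if dod is not None else None

    # Copy every field except the full date of death, which is not retained.
    derived = {k: v for k, v in record.items() if k != "date_of_death"}
    derived["died_within_30_days"] = days is not None and 0 <= days <= 30
    derived["died_within_90_days"] = days is not None and 0 <= days <= 90
    return derived


# Hypothetical patient who died 45 days after the index date.
row = {"patient_id": "p1",
       "index_date": date(2021, 1, 1),
       "date_of_death": date(2021, 2, 15)}
flags = derive_mortality_flags(row)
```

Dropping the field from the working record is only part of destruction; the received file containing the full date would also need to be securely deleted.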

The analysis will use various models to compare mortality, length of stay and discharge destination between patients in intervention and control areas. These results will feed into the health economic decision models.

There will be no requirement and no attempt to reidentify individuals when using the data. Researchers from the Institute of Epidemiology and Health Care at UCL will analyse the data for the purposes described above.


Using Large-scale Routine Data to Monitor and Improve Ethnic Inequalities in Cancer and Cardiovascular Disease ( ODR1920_301 ) — DARS-NIC-656874-T3L9D

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant, No (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(2)(a), Other-The Health Service (Control of Patient Information) Regulations 2002- Regulation 2

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2023-04-28 — 2023-12-31 2024.03 — 2024.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL), UNIVERSITY COLLEGE LONDON (UCL), UNIVERSITY OF LEICESTER

Sublicensing allowed: No

Datasets:

  1. NDRS Cancer Registrations
  2. NDRS Linked HES AE
  3. NDRS Linked HES APC
  4. NDRS Linked HES Outpatient

Objectives:

The project aims are to:
1. Investigate ethnicity reporting in cancer and cardiovascular diseases (CVD) data.
2. Characterise the burden of coexisting cancer and CVD in Black Minority Ethnic (BME) groups.

The objectives are:
· Look at the quality of ethnicity reporting in routine healthcare data.
· Determine the incidence and prevalence of co-existing cancer and CVD by ethnic group.

Yielded Benefits:

Data for this study has previously been shared when the data were controlled and managed by Public Health England (PHE). As such there are some yielded benefits to be observed from the access to the data for the study prior to NHS Digital becoming data controller. These yielded benefits are noted below: a review of the impact of COVID-19 on multimorbid ethnic minority groups was conducted and published in the JACC cardio-oncology journal.

Expected Benefits:

The importance of the overlap between cancer and cardiovascular disease (CVD) is illustrated by the novel discipline of “cardio-oncology” which has emerged from the recognition that anti-cancer treatments (e.g. chemotherapy) can be associated with adverse CVD complications (e.g. heart failure). There are similarities at the level of risk factors (e.g. tobacco, obesity) which represent opportunities for shared prevention strategies. Better characterisation of coexisting cancer and CVD may lead to improvements in treatment and prevention of both diseases.
Research has found that ethnic minority groups living in white majority European countries have a higher prevalence of multi-morbidity (2 or more chronic illnesses) and have an earlier onset of multi-morbidity than their white counterparts.
Based on this evidence, there is a need for further investigation into the rates of individuals living with both cancer and CVD as well as looking into the occurrence of these co-morbidities across ethnic groups.
In both CVD and cancer, health inequalities for disease incidence, outcomes and treatment have been reported. Ethnicity health data in the UK has historically been inaccurate and there is a need for further research to determine where the gaps are and how it can be improved.

Characterising the overlap between cancer and cardiovascular disease in BME groups is beneficial to the health and social care system for the following reasons:

First, it will improve the public’s health and wellbeing, by identifying health inequalities in individuals suffering from both cancer and CVD.

Second, it will improve population health through sustainable health and care services, by maximising the use of existing national audit data and other National Health Service (NHS) programmes to gain new insights to guide the planning of health services.

Moreover, enhancements to coding for individuals and healthcare utilisation by ethnicity are possible from this research.

Third, it will build the capacity and capability of the public health system, by highlighting which individuals and areas have the greatest need in overlapping multi morbidity for the two most common types of disease, by combining datasets from different health audits, for cancer and CVD.

Outputs:

The anticipated outputs for this project will be several recommendations and papers.
The results from the analysis looking at the quality of ethnicity coding in the data are anticipated to be published as a paper, as well as several recommendations to improve the analytical capabilities of routinely collected ethnicity data, and the coding terminology used in the collection of data. The results from the analysis looking at the incidence, hospitalisation and mortality rates for people with both cancer and cardiovascular disease will also be published as papers. Dissemination of the outputs can be done via 3 routes: media; scientific publications and/or presentations; and Patient and Public Involvement (PPI) engagement. The PhD outputs are also expected to be presented at relevant conferences, whether cardiology (e.g. European Society of Cardiology Annual Scientific Congress), public health or ethnicity and health (e.g. South Asian Health Foundation annual conference).
The expected target date for submission of this PhD project is 31 August 2023.

Processing:

Within this programme of work there are a range of research questions requiring a variety of analytical strategies. The research team includes epidemiologists and statisticians with a considerable track record of the analysis of similar large datasets.

Initially, we will perform preliminary analyses, which will be exploratory though focused around the key hypotheses, to better understand the different treatment processes for the subsets of patients. In additional preliminary analyses, we will then quantify effect sizes using simple logistic regression modelling techniques whilst appropriately accounting for potentially confounding covariates. Where appropriate we will also consider different study designs utilising the rich nature of the linked data resource. For example, matched cohort studies where patients with both cardiovascular disease and cancer are matched to patients suffering from a single condition with similar covariate patterns. Many of the outcomes are of a time-to-event nature.

For example, time to revascularisation, time to recurrence of cancer or time to death. For these analyses, we will use flexible parametric survival models in order to appropriately account for non-proportional hazards and to potentially account for competing risks where necessary. We will build on previous work utilising excess mortality modelling techniques to understand mortality associated with the diagnosis of multiple conditions [18]. Where necessary, we will use mixed effects models to account for the hierarchical nature of the data. The group have experience of quantifying the outputs from complex models in ways that are easily interpretable for a wide variety of audiences. For example, the use of avoidable deaths [32], loss in expectation of life [18], and real-world probabilities [33] accounting for competing risks. We will investigate regional variation across the different analysis strategies and research questions where appropriate. We will utilise mapping techniques and funnel plots to present variation beyond that expected by chance.

This study will not recruit patients but will use existing pseudonymised national audit data for the purposes of research.


Childhood outcomes after perinatal brain injury (Data flowing to ONS) — DARS-NIC-342322-Q1N7M

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant, Yes (Mixture of confidential data flow(s) with support under section 251 NHS Act 2006 and non-confidential data flow(s))

Legal basis: Health and Social Care Act 2012 - s261 - 'Other dissemination of information', Health and Social Care Act 2012 - s261(5)(d)

Purposes: No (Academic)

Sensitive: Non-Sensitive, and Sensitive

When:DSA runs 2021-10-18 — 2024-10-17 2024.02 — 2024.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL), IMPERIAL COLLEGE LONDON, UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Birth Notification Data
  2. Civil Registration - Births
  3. Civil Registration - Deaths
  4. Hospital Episode Statistics Accident and Emergency
  5. Hospital Episode Statistics Admitted Patient Care
  6. Hospital Episode Statistics Outpatients
  7. Mental Health Services Data Set
  8. Civil Registration (Deaths) - Secondary Care Cut
  9. Civil Registrations of Death
  10. Hospital Episode Statistics Accident and Emergency (HES A and E)
  11. Hospital Episode Statistics Admitted Patient Care (HES APC)
  12. Hospital Episode Statistics Outpatients (HES OP)
  13. Mental Health Services Data Set (MHSDS)
  14. Civil Registrations of Death - Secondary Care Cut

Yielded Benefits:

This is a new Data Sharing Agreement. No data has been disseminated by NHS Digital for this research study. There are therefore no yielded benefits to date.

Expected Benefits:

This population study is hoped to provide the most complete picture of how children’s lives are affected by perinatal brain injury, providing essential information to answer parents’ questions accurately and in a meaningful family-centric manner. This information is intended to reshape clinical practice and facilitate optimum service planning within the NHS, to meet the needs of these children and their families through to adulthood, and ultimately improve their future health outcomes. An understanding of the sequelae of perinatal brain injury, specifically how and when children are affected, is expected to inform enhanced developmental surveillance across the NHS and enable the design of targeted multidisciplinary interventions to support children as needed. For example, premature infants (prone to inattention) can benefit from delayed school entry, Special Educational Needs (SEN) support, and educational packages raising awareness amongst educators of these specific challenges.

Anticipated impact on neonatal care, society and NHS services:

Impact on neonatal care
• Equip healthcare professionals with reliable information to counsel families (target date 2024).
• Communication aids will facilitate meaningful family-centred conversations on the neonatal unit (target date 2024)
• Help prepare families for their child’s future and understanding what additional support may be needed (long-term)
• Encourage healthcare professionals to consider the long-term impact of various neonatal care decisions (long-term)

Impact on the NHS and policymakers
• Help those involved in shaping policy, resource planning and service provision to make informed decisions about how to most effectively support these children whilst maximising the efficiency of services (target date 2024-25)
• UCL findings are intended to inform national guidelines on follow-up after brain injury (target date 2024-25)

Impact on schooling and policymakers
• Equip parents with important information about the academic impact of brain injuries to help them plan their child’s future and support them with their educational needs (long-term)
• Provide key information and education to teachers about how they can support children with perinatal brain injuries (long-term)
• Help the Department for Education in determining resource allocation and the provision of additional educational support (long-term)

Outputs:

The research plan to date has been shaped by detailed feedback from charity representatives and over 30 parents and ex-neonatal unit patients via the Great Ormond Street Parent Advisory Committee, the BLISS Insight and Involvement group, and the Meningitis Research Foundation. Parents, via these partner charity organisations, will continue to be involved in focus groups over the lifetime of the study in order to explore the study results and capture their thoughts on what they mean for parents and how best to communicate these results. The Patient and Public Involvement (PPI) work, undertaken in collaboration with the aforementioned charities, highlighted that evidence about the long-term impact of brain injuries (particularly the unseen impact on mental health and schooling) was a frequently overlooked parental priority. It matters to the people most affected.

Academic outputs are hoped to include high-impact peer reviewed publications, and international conference presentations. Findings are expected to be submitted for publication in high impact general medical journals, such as the New England Journal of Medicine, the British Medical Journal, and JAMA Pediatrics. The study results are intended to be presented at international conferences such as the Royal College of Paediatrics and Child Health annual conference, the Kings Fund annual conference, and the Paediatric Academic Societies meeting in the USA.

Publications will be Open Access as per UCL policy, and freely available both on journal websites and via the UCL webpage. Outputs will contain only aggregate level data with small numbers suppressed in line with National Neonatal Research Database (NNRD), NHS Digital, Office for National Statistics (ONS) and Department for Education (DfE) policy and guidance. All data will be stored within the ONS secure research service (SRS) and all outputs from this server undergo independent checks by ONS staff to ensure outputs meet regulations and could not be deemed identifiable in any way.

Dissemination of the research findings to the public (parents who have children with a perinatal brain injury) is intended to be facilitated through existing collaborations with the Neonatal Data Analysis Unit (NDAU), BLISS (the charity for babies born sick or premature) and the Meningitis Research Foundation. UCL are also looking to create an infographic/information leaflet to improve communication of prognosis after perinatal brain injury between doctors and parents. Public dissemination is intended to include production of lay research reports publicised on the NDAU, BLISS, UCL and Meningitis Research Foundation websites. Research regarding neonatal outcomes has attracted a high level of media interest, and it is anticipated that this will be the case for the proposed study. UCL are acutely aware of the potential harmful effect of inaccurate or sensational reporting of research findings in this sensitive area, and the confusion and anxiety this can cause for affected families. UCL are planning to work closely with BLISS and Imperial College London to co-ordinate press releases and ensure that information is conveyed accurately and responsibly. BLISS and the Meningitis Research Foundation are also expected to publicise findings to their followers and the general public through their social media channels.

UCL will commence analysing the data as soon as it has been made available in the ONS SRS. It is anticipated that the process of data analysis, interpretation and report writing will take approximately 36 months, with papers submitted for publication in mid to late 2024.

Processing:

The study will involve the following data processing and linkage steps:

1. Infants meeting the Department of Health definition for perinatal brain injury will be identified within the National Neonatal Research Database (NNRD) (cohort 1, n = 40,166). This database contains care data for all neonates admitted to NHS neonatal units across England, Wales and Scotland. Its population coverage is internationally unique with 100% coverage since 2012 and high representative coverage since 2008. The 14,911 premature infants (< 34 weeks gestation) in cohort 1 will be matched to a comparator group of infants within the NNRD (cohort 2, n = 14,911).
2. The pseudonymised neonatal care data for cohort 1 and 2 will be transferred to the ONS Secure Research Service (SRS) by Imperial College London.
3. The NNRD will transfer the minimum identifiers for the NNRD cohorts (1 and 2) to NHS Digital (NHS number, date of birth, sex and postcode at birth). The NNRD will also provide the birth weight, gestation (from 2015), and multiplicity status (i.e. twins, triplets etc) for the remaining 25,255 children with gestation time > 34 weeks in cohort 1 to NHS Digital.
4. The 25,255 un-matched infants in cohort 1 with perinatal brain injury will be matched in a 1:3 ratio, by NHS Digital, to a comparator group of infants, identified from Birth Notifications and Civil Registrations (Births) data to create a ‘term’ control cohort (cohort 3, n = 75,765).
5. All 3 cohorts will be linked to Civil Registrations (Deaths), Hospital Episode Statistics (HES) Admitted Patient Care (APC), HES Accident and Emergency (A&E), HES Outpatients and the Mental Health Services Data Set (MHSDS) up to December 31st 2020, by NHS Digital. The pseudonymised health outcomes and analysis covariates from the Births products for the three cohorts will be transferred from NHS Digital to the ONS SRS.
6. Under DARS-NIC-475526-F3Z5H, a file containing a list of personal identifiers (forename, surname, date of birth, sex, and postcodes) for linkage to the National Pupil Database (NPD) will be transferred from NHS Digital to the Department for Education. The NPD contains detailed information on the educational attainment, special educational needs and attendance of children at state schools across England between the ages of 5-18 years. A logic model, designed to maximise the chance of a reliable postcode match (given the variation over time), will be used. After linkage, all identifiers will be removed (only the unique study ID number will be retained) and these pseudonymised educational data will also be securely transferred for storage within the ONS SRS.
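The cohort sizes quoted in steps 1 to 4 are internally consistent (40,166 minus 14,911 is 25,255, and three controls for each of those gives 75,765). The 1:3 matching in step 4 can be illustrated with a minimal sketch. The function name `match_controls` and the matching key used in the example are assumptions for illustration only; the source does not state which variables NHS Digital matches on or how ties are resolved.

```python
import random
from collections import defaultdict

# Cohort sizes as quoted in the processing steps.
COHORT_1 = 40_166                                   # infants with perinatal brain injury
PRETERM_IN_COHORT_1 = 14_911                        # matched within the NNRD (cohort 2)
TERM_IN_COHORT_1 = COHORT_1 - PRETERM_IN_COHORT_1   # remaining infants to be matched
COHORT_3 = 3 * TERM_IN_COHORT_1                     # 1:3 matched comparator group


def match_controls(cases, pool, ratio=3, seed=0):
    """Pick `ratio` controls per case, without replacement, sharing the case's key.

    `cases` and `pool` are lists of (record_id, key) tuples. The key is a
    hypothetical matching stratum; the real matching variables are not
    stated in the source.
    """
    rng = random.Random(seed)
    by_key = defaultdict(list)
    for record_id, key in pool:
        by_key[key].append(record_id)
    for ids in by_key.values():
        rng.shuffle(ids)

    matches = {}
    for case_id, key in cases:
        candidates = by_key[key]
        if len(candidates) < ratio:
            raise ValueError(f"not enough controls for case {case_id}")
        # pop() removes each chosen control so it cannot be reused.
        matches[case_id] = [candidates.pop() for _ in range(ratio)]
    return matches
```

Matching without replacement is a design choice in this sketch; it keeps the comparator group at exactly three distinct infants per case, which is what the quoted cohort 3 size implies.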

UCL researchers will only have access to pseudonymised data held within the ONS SRS. In order to access any data in the ONS SRS, all researchers will need to be ONS accredited and undergo data protection and confidentiality training. No data will be held by or at UCL. There will be no requirement or attempt to re-identify participants. Indeed, this would not be possible for UCL.

Named UCL research staff responsible for conducting the analysis for the project will complete the ONS Researcher Accreditation process, which involves specific training in the safe use of research data environments. They will sign and adhere to the ONS Accredited Researcher Declaration, and will be required to adhere to ONS data protection policies and procedures. All data to be transferred out of the SRS (the results of the analyses) will be checked by ONS staff to ensure that no individual level data, or potentially identifiable data, is transferred. Only aggregate level data with small number suppression will be transferred out of the SRS system for publication.
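Small number suppression, as mentioned above, can be sketched as follows. The threshold of 10, the `*` marker and the choice to leave zero counts unsuppressed are all assumptions made for illustration; the source does not state the rules applied by ONS staff.

```python
def suppress_small_counts(table, threshold=10, marker="*"):
    """Replace counts between 1 and threshold - 1 with a marker.

    `table` maps a category label to a count. The threshold and marker are
    hypothetical defaults, not values taken from any published policy.
    """
    return {
        label: (marker if 0 < count < threshold else count)
        for label, count in table.items()
    }
```

For example, `suppress_small_counts({"group A": 3, "group B": 25})` masks the count for group A and leaves group B unchanged.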

Data retention
The linkage keys used for the health and educational linkages will be securely held by NHS Digital and the Department for Education respectively. Only the pseudonymised dataset will be retained within ONS SRS to facilitate analysis by the UCL research team.


MR1470 - Using routine data to identify and assess clinical outcomes for the STAMPEDE trial: Systemic Therapy in Advancing or Metastatic Prostate Cancer: Evaluation of Drug Efficacy. — DARS-NIC-59873-D8C6G

Type of data: information not disclosed for TRE projects

Opt outs honoured: No - data flow is not identifiable, Anonymised - ICO Code Compliant, No, Identifiable, Yes (Consent (Reasonable Expectation), Mixture of confidential data flow(s) with consent and flow(s) with support under section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(2)(b)(ii), Health and Social Care Act 2012 - s261(5)(d); Health and Social Care Act 2012 – s261(2)(c); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No, Yes (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2020-07-01 — 2023-06-30 2021.01 — 2024.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Demographics
  2. Cancer Registration Data
  3. Civil Registration - Deaths
  4. Civil Registrations of Death

Objectives:

Prostate cancer is a major health problem world-wide and accounts for nearly one fifth of all newly-diagnosed male cancers. In the UK, approximately 48,500 men were diagnosed with prostate cancer in 2017 and over 12,000 men died from the disease.

The Systemic Therapy in Advancing or Metastatic Prostate Cancer: Evaluation of Drug Efficacy (STAMPEDE) trial is a randomised controlled trial which is looking at adding therapies to standard care for men with high-risk prostate cancer starting long-term hormone therapy for the first time. The trial's definitive primary outcome measure is overall survival and intermediate primary outcome measure is failure-free survival.

The Medical Research Council Clinical Trials Unit at University College London (MRC CTU at UCL) is the Data Controller and processes data for this trial.

This project plans to use routinely collected health data within STAMPEDE to identify and assess clinical outcomes. The overall aim of STAMPEDE is to assess novel approaches for the treatment of men with prostate cancer who are starting long-term Androgen Deprivation Therapy (ADT). Open since October 2005, this Multi-Arm-Multi-Stage (MAMS) trial is the largest study of treatments for prostate cancer in the world, and is currently recruiting to three arms:
• Standard of Care (SOC, arm A)
• SOC with Metformin (arm K)
• SOC with Transdermal oestradiol (arm L).

With over 13,000 men recruited, most are from England (85%), with others from Wales (6%), Scotland (7%), N Ireland (2%) and a small number from overseas (1%). To take part in STAMPEDE, patients agree for us to collect information on outcomes such as long-term survival and failure-free survival which can be accessed from routine data sources such as the ONS mortality data and Hospital Episode Statistics (HES).

STAMPEDE initially assessed the effects of several medications: bisphosphonate (zoledronic acid), a cytotoxic chemotherapeutic agent (docetaxel) and a cyclooxygenase (Cox-2) inhibitor (celecoxib), as single agents or combinations, in patients commencing long-term ADT for locally advancing or metastatic prostate cancer. Since the start of the trial, a number of new research arms have been added to STAMPEDE over time to evaluate: abiraterone, a steroid synthesis inhibitor; prostate radiotherapy for patients with newly-diagnosed metastatic disease; enzalutamide, an inhibitor of androgen receptor signalling, given with abiraterone; and metformin, an anti-diabetic medication. In Protocol version 16.0, a new research arm was added for transdermal oestradiol, given as an alternative form of ADT.

The multi-stage element of the trial design allows patient recruitment to discontinue in treatment arms that are not showing sufficient activity, based on a series of pre-planned, interim, lack-of-benefit analyses. In general, MAMS is an adaptive design and can be regarded as one type of group sequential design.

There are eight research arms closed to recruitment, five of which have already reported results, and three arms are now in long-term follow-up with further analyses planned:
• SOC with Abiraterone (arm G)
• SOC with Prostate radiotherapy (arm H)
• SOC with Enzalutamide and abiraterone (arm J)

Over the next three years (2020 to 2023), five analyses are planned with the following comparisons:
1. Arm A vs Arm G: participants with metastatic prostate cancer
2. Arm G vs Arm J: participants with locally advanced prostate cancer
3. Arm A vs Arm J: participants with metastatic prostate cancer
4. Arm A vs Arm H: participants with metastatic prostate cancer
5. Arm A vs Arm K: all participants.

More than half of men taking part in STAMPEDE will survive for five years or more, but the Medical Research Council (MRC) Clinical Trials Unit (CTU) at University College London (UCL) are concerned that some trial centres find it difficult to maintain full follow-up over this period. Flagging data through NHS Digital’s three data products (Demographics, Civil Registration (Deaths) and Cancer Registration Data) will allow the MRC CTU at UCL to ensure that deaths and cancer events are captured promptly. Linked data from NHS Digital will therefore improve the estimates of survival, and may also reduce the burden on NHS sites. For the purposes of survival analysis, the MRC CTU will also be able to assume that patients are alive at a set point in time if not reported dead, thereby increasing reliability of data for this study.

Furthermore, no man should die from prostate cancer without prior progression, so a reported death will allow MRC CTU to check that events are not missed on the Case Report Forms (CRFs) that clinicians complete for their trial participants. Based on previous discussions with ONS statisticians, the MRC CTU will make assumptions about the survival of patients not reported as dead.

For the pre-specified primary analysis of time to overall survival or censoring, the MRC CTU uses the date of death to the nearest day (collected on the Death CRF), or time to the day the patient was last known to be alive for individuals who are censored. The aim is to maintain this level of precision in any analyses using information from external sources; using routine data that records time to death or censoring to the nearest day (as opposed to the nearest week or month) enhances the precision with which the MRC CTU will be able to distinguish a difference in survival between two treatment groups.
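To illustrate why day-level dates matter, the sketch below computes follow-up time in days and feeds it to a Kaplan-Meier estimator. This is a generic, standard estimator shown for illustration; the source does not say which estimator or software the MRC CTU uses. Using the randomisation date as the time origin is also an assumption.

```python
from datetime import date


def follow_up_days(randomised, death=None, last_known_alive=None):
    """Return (days, event): event is True for a death, False for censoring.

    `randomised` is the assumed time origin. Exactly one of `death` or
    `last_known_alive` is expected to be supplied.
    """
    if death is not None:
        return (death - randomised).days, True
    return (last_known_alive - randomised).days, False


def kaplan_meier(observations):
    """Kaplan-Meier survival estimate from a list of (days, event) pairs.

    Returns [(day, survival_probability)] with one entry per distinct day
    on which at least one death occurred.
    """
    at_risk = len(observations)
    survival = 1.0
    curve = []
    for day in sorted({d for d, _ in observations}):
        deaths = sum(1 for d, e in observations if d == day and e)
        leaving = sum(1 for d, _ in observations if d == day)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((day, survival))
        # Deaths and censored patients both leave the risk set after this day.
        at_risk -= leaving
    return curve
```

If dates were rounded to the nearest week or month, distinct event times would collapse onto the same value and the ordering of deaths and censorings within that interval would be lost.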

Provided that the statistical models are correct, this enables MRC CTU's estimates of the survival difference for patients allocated to one treatment relative to another to reflect as closely as possible the reality of any survival difference attributable to treatment allocation in the study setting; and to obtain a greater degree of confidence in understanding of the true effect on survival of a new treatment based on the data available to us.

This can have important implications for the treatment future patients receive: if the evidence collected is strong enough to conclusively suggest a survival advantage gained from a treatment, it is more likely that this treatment will be made available to those patients. Conversely, if the analysis suggests the treatment is effective but the MRC CTU at UCL are not sufficiently confident in the strength of the evidence to support this, the findings are less likely to translate into a real difference for patients.

For information, there is intention to request access to Hospital Episode Statistics (HES) data linked to this request later, which would allow the MRC CTU at UCL to understand the treatment patterns of the trial cohort. This will be subject to a new application, and thus agreement, with NHS Digital.

The linkage requested is necessary for the performance of a task carried out in the public interest: improving the treatment offered to men with prostate cancer (GDPR Article 6(1)(e)). Processing is also necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 9(2)(j) of GDPR.

STAMPEDE is currently funded by Cancer Research UK’s Clinical Research Committee (formerly the Clinical Trials Advisory Awards Committee; CTAAC; ref C547/A3804 - STAMPEDE). UCL is the trial’s Sponsor and delegates sponsorship responsibilities to MRC CTU at UCL.

Expected Benefits:

In the UK there are 48,500 new cases of prostate cancer each year and 12,000 deaths. STAMPEDE aims to provide evidence as to what is the best way of treating men with newly diagnosed advanced prostate cancer, and the trial has already led to improvements in standard care. It involves many different comparisons, and as an ongoing trial is expected to provide further important results over coming years.

Since opening in 2005 over 13,000 participants have joined the trial. MRC CTU at UCL has already reported practice-changing results that show adding docetaxel or abiraterone improves disease control and life-expectancy. Several other strategies have been tested with more results expected soon, including the results of the abiraterone and enzalutamide combination and of radiotherapy to the prostate in men with newly-diagnosed metastatic disease.

The benefits of utilising routine data from NHS Digital will be improvement in the timely collection of survival data, with improved accuracy, thereby gaining robust evidence contributing to the impact on healthcare of men with prostate cancer. It will also alleviate some of the reporting burden on trial centre staff.

Outputs:

The MRC CTU at UCL will create intermediate trial reports for review by the Independent Data Monitoring Committee (IDMC), an independent group of experts who monitor patient safety and treatment efficacy data. The IDMC usually meets a minimum of once a year per "comparison". Reports to the committee are confidential, and they are the only people to see data by randomised group while the trial is in progress. The IDMC will only see NHS Digital data that has been aggregated with small numbers suppressed. They may recommend changes to the trial, for action by the trial steering committee. Results for each comparison are triggered by a certain number of deaths in the contemporaneously randomised control arm patients for each comparison. Peer-reviewed publications will be produced in high-impact medical journals, either a cancer-specific journal (like JCO or Lancet Oncology) or a general medical journal (like Lancet, The Journal of the American Medical Association, The New England Journal of Medicine). MRC CTU at UCL will look to general journals first but will review the results and whether they are likely to appeal to a general audience.

MRC CTU will communicate the STAMPEDE results using at least:
• Presentation at major international and national scientific conferences
• Publication in high-impact peer-reviewed journals
• A written summary of results distributed to participants
• News articles on the STAMPEDE website
• Tweets on the @MRCCTU Twitter account

MRC CTU at UCL will communicate the results to the wider patient population via articles in the Tackle Prostate newsletter, Prostate Matters.

MRC CTU at UCL will also inform Prostate Cancer UK of the results, building on the relationship MRC CTU at UCL have with them for other trials in MRC CTU's prostate cancer portfolio. If appropriate, MRC CTU at UCL will work with the MRC and UCL press offices to develop press release(s) about the results. Depending on what the results show, MRC CTU at UCL may also look at other methods of communication. For previous prostate cancer trials MRC CTU at UCL have used films, briefing papers and events to communicate the results to health-workers and patients.

MRC CTU will communicate the results to the trial participants via a lay summary which will be distributed by STAMPEDE site staff. The summary is prepared at the MRC CTU by the STAMPEDE trial team and the MRC CTU PPI Group (includes patient representatives and our Policy, Communications and Research Impact Coordinator); this is the same way that previous STAMPEDE findings were disseminated to participants, and examples of the correspondence to STAMPEDE site staff and the participant summary from earlier comparisons have been saved as supporting documents.

All outputs will be aggregated with small numbers suppressed and in line with the HES Analysis Guide.

The next indicative date for data output is Summer 2020 for the Abiraterone long-term comparison (Arms A vs G) in metastatic prostate cancer patients. All outputs will be aggregated with data already held on the consenting patients. No data published will lead to individuals being identified.

Processing:

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract). There will not be any access to the data by any third parties.

The requested data will be used for the long-term assessment of men in the many comparisons of the STAMPEDE trial protocol. The primary outcome measure is survival, but cause of death, which can be difficult to ascertain in men with prostate cancer, is an important secondary outcome measure. The data from NHS Digital will allow the MRC CTU at UCL to ensure that deaths are captured and included promptly by all centres and clinics involved in the trial, and to verify that all information is correct and recorded. The number of participants and the rationale for their inclusion will always be included in all presented results.

MRC CTU at UCL is not permitted to re-identify individuals under this agreement.

The data will be held on UCL’s Data Safe Haven using UCL approved computers. The Data Safe Haven is UCL’s technical solution for transferring and storing research information that is highly confidential. It meets the requirements of the NHS Digital DSP Toolkit and ISO 27001 Information Security standard. Access is controlled by the Information Asset Owner, and all UCL staff complete training in confidentiality and data protection, which is renewed annually.

The processing activities are as follows:

1. The STAMPEDE Trial team will identify the trial participants for linkage to NHS Digital data, and inform the UCL MRC CTU Head of Data Management Systems (DMS; acting as the Cohort Contributor and the Data Recipient). The team will provide Study ID, date of birth and date of last visit to the Head of DMS.

2. The UCL MRC CTU Head of DMS has a secure database of Patient Identifiable Data (PID) in UCL’s Data Safe Haven; and will extract the PID and merge this with a list of Study IDs.

3. The Study ID list will be sent by the UCL MRC CTU Head of DMS to NHS Digital with the following identifiers for linkage to the requested NHS Digital data products (Demographics, Cancer Registration, Civil Registration (Deaths)):
• Study ID (STAMPEDE trial participant identifier)
• NHS Number
• Date of birth
• Postcode
There is no linkage field for gender as the entire trial cohort is male.

Patients have consented for data to be shared with researchers in an anonymised or linked anonymised form (this is where the 'Linked anonymised data' are anonymous to the people who receive and hold it (e.g. a research team) but contain information or codes that would allow the suppliers of the data to identify people from it). They have also consented that personal details can be used to obtain long term follow up information from national registries.

A privacy notice detailing the linkage of trial participant information to electronic health records held by NHS Digital and other similar bodies was published on the STAMPEDE website in 2018. At the same time, two letters about 1) Ongoing participation and trial updates for participants in Arms A (joined after 15 Nov 2011), G, H, J, K and L, and 2) End of participation for participants in Arms B, C, D, E, F and Arm A (who joined before 15 Nov 2011) were sent by trial centres. These letters provided updated information to trial participants about the use of their personal data (name, postcode, NHS number) to obtain health data from NHS Digital, Public Health England and the National Cancer Registration and Analysis Service. All participants have the opportunity to withdraw from the trial if they have any objections to the use of their data in this way.

4. NHS Digital will use the supplied information to extract linked data from the requested data products, including the full date and cause of death. These pseudonymised datasets with the Study ID will be sent to the MRC CTU at UCL using the specified transfer method. The data will reside in UCL’s Data Safe Haven and will be identified by Study ID only, thus there will be no identifying personal data attached to a study number. Only defined members of the STAMPEDE trial team and MRC CTU’s methodology team will have access to Data Safe Haven for data analysis - all are substantive employees of UCL. All UCL substantive employees have completed training in data protection and confidentiality, and users of Data Safe Haven receive appropriate training before being granted access.

5. NHS Digital records will be uploaded to UCL’s Data Safe Haven; an output file for trial statisticians with the study-specific trial number will be prepared and checked to ensure there is no PID within the file.

6. This output file is placed in a secure directory with access limited to certain members of the STAMPEDE trial team.

7. Trial statisticians undertake data cleaning/validation activities for the processing of the data for the STAMPEDE trial.

8. The data will be used as a prompt to follow up with sites to get them to complete STAMPEDE Case Report Forms.
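The check in step 5, that the output file for statisticians contains no PID, could be sketched as below. The field names in `PID_FIELDS` and both function names are hypothetical; the source does not describe how the check is actually carried out.

```python
# Hypothetical field names; the real file layout is not given in the source.
PID_FIELDS = frozenset({"nhs_number", "date_of_birth", "postcode", "name"})


def strip_pid(records, pid_fields=PID_FIELDS):
    """Return copies of the records with every PID field removed."""
    return [
        {field: value for field, value in record.items() if field not in pid_fields}
        for record in records
    ]


def assert_no_pid(records, pid_fields=PID_FIELDS):
    """Raise ValueError if any record still carries a PID field."""
    for record in records:
        leaked = pid_fields.intersection(record)
        if leaked:
            raise ValueError(f"PID fields present: {sorted(leaked)}")
```

A deny-list of this kind only catches fields with known names; it would not detect identifiers stored under an unexpected column name or embedded in free text.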

Data provided by NHS Digital will only be accessed and processed by substantive employees of UCL.

There will be no access to data by other third parties not listed in this agreement.

All outputs produced with data provided by NHS Digital will be aggregated with small numbers suppressed and in line with the HES Analysis Guide.


Assessing the impact of the COVID-19 pandemic on vulnerable children: the DHSC-ECHILD-COVID study — DARS-NIC-381972-Q5F0V

Type of data: information not disclosed for TRE projects

Opt outs honoured: No - data flow is not identifiable, Anonymised - ICO Code Compliant, No, Yes (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(2)(b)(ii), Health and Social Care Act 2012 - s261 - 'Other dissemination of information', Health and Social Care Act 2012 - s261(5)(d)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2020-08-17 — 2023-08-16 2020.12 — 2024.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No, Yes

Datasets:

  1. Hospital Episode Statistics Admitted Patient Care
  2. Civil Registration - Deaths
  3. Emergency Care Data Set (ECDS)
  4. HES:Civil Registration (Deaths) bridge
  5. Hospital Episode Statistics Accident and Emergency
  6. Hospital Episode Statistics Critical Care
  7. Hospital Episode Statistics Outpatients
  8. HES-ID to MPS-ID HES Admitted Patient Care
  9. Civil Registrations of Death
  10. Hospital Episode Statistics Accident and Emergency (HES A and E)
  11. Hospital Episode Statistics Admitted Patient Care (HES APC)
  12. Hospital Episode Statistics Critical Care (HES Critical Care)
  13. Hospital Episode Statistics Outpatients (HES OP)
  14. Birth Notification Data
  15. Community Services Data Set (CSDS)
  16. Maternity Services Data Set (MSDS) v1.5
  17. Maternity Services Data Set (MSDS) v2
  18. Mental Health and Learning Disabilities Data Set (MHLDDS)
  19. Mental Health Minimum Data Set (MHMDS)
  20. Mental Health Services Data Set (MHSDS)
  21. Mental Health Services Data Set (MHSDS) v5.0
  22. Civil Registration - Births

Objectives:

The data is requested for a programme of research relevant to the aims of the National Institute for Health Research Policy Research Unit for Children, Young People and Families (CPRU), within University College London (UCL).

CPRU is one of 15 NIHR Policy Research Units formed to undertake research to inform decision-making by government and arms-length bodies. CPRU works closely with the Department of Health and Social Care to determine priorities and provide evidence directly to the Secretary of State for Health, government departments and arms-length bodies, such as NHS England and Public Health England.

For this programme of research, UCL are the sole Data Controller who also process data. The London School of Hygiene and Tropical Medicine (LSHTM), the Office for National Statistics (ONS) and The Institute for Fiscal Studies (IFS) are also listed as data processors.

The study is looking at the impact of COVID-19 and lockdown on children and young people and whether there are any differences in the health and social effects of household confinement on vulnerable children and young people when compared to other children and young people. Children and young people (CYP) who are vulnerable due to social welfare or chronic health needs are expected to experience more adverse health and social effects of the COVID-19 lockdown than other CYP.

Key concerns for services are the effects of household confinement during the COVID-19 lockdown, combined with the limited access to support from health, social care and education services. The researchers urgently need to understand what impacts COVID-19 infection and related public health responses (such as lockdown) have had on CYP, to inform strategies for the current wave of infection, and any future waves.

This study, Department of Health and Social Care - Education and Child Health Insight Linked Data - COVID (DHSC-ECHILD-COVID), builds on the Education and Child Health Insight Linked Data (ECHILD) project (DARS-NIC-27404-D5Z3F - approved), which uses linked education and HES data for four one-year cohorts amounting to two million CYP in England. The linkage under this application will be extended urgently to address the impact of COVID-19 on all CYP (linkage involving an expected 18 million CYP) and in particular vulnerable CYP as this is the group most likely to be impacted by lockdown. The researchers wish to include all children and young people (CYP) appearing in HES records from (the latest of) birth or April 1997 onwards, who are aged between 0 and 24 years in the COVID-pandemic year (hence the start date for birth is the start of the school year, 1.9.1995).
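The inclusion rule described above (records from the latest of birth or April 1997, with births from 1.9.1995) can be written as a small function. Reading "April 1997" as 1 April 1997 and "1.9.1995" as 1 September 1995 are assumptions, as is the function name `follow_up_start`.

```python
from datetime import date

HES_START = date(1997, 4, 1)       # assumed: the source says only "April 1997"
EARLIEST_BIRTH = date(1995, 9, 1)  # "1.9.1995" read as 1 September 1995


def follow_up_start(date_of_birth):
    """Return the first date from which a child's HES records are in scope.

    Returns None for anyone born before the earliest eligible birth date.
    """
    if date_of_birth < EARLIEST_BIRTH:
        return None
    # "The latest of birth or April 1997".
    return max(date_of_birth, HES_START)
```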

PURPOSE

DHSC-ECHILD-COVID addresses four priority areas raised by the Department of Health and Social Care (DHSC) with the Children’s Policy Research Unit (CPRU) team relating to the secondary impacts of infection and lockdown on:

~ CYP who need safeguarding
~ poorer families
~ CYP with special educational needs
~ health inequalities

These vulnerable groups can only be reliably identified through linkage of longitudinal health, education and social care data.

For the purpose of this application 'vulnerable' can be defined as:

The researchers will draw on the published DfE definition for vulnerable children and young people. This relates to children and young people aged 0-25 years who are assessed as being in need under section 17 of the Children Act 1989 (i.e. have a child in need plan, child protection plan, or are a looked-after child), have an education, health and care (EHC) plan or have been assessed as otherwise vulnerable by educational providers or local authorities (e.g. children on the edge of receiving support or those at risk of becoming not in employment, education or training).

The researchers also explore whether children with long-term health conditions such as asthma or poor mental health, and those allocated any special educational needs (as indicators of underlying health or behavioural problems), are at greater risk of adverse impacts of infection or lockdown.
Children with indicators of vulnerability can only reliably be identified through linkage of health, education and social care data.

The researcher will focus on two specific research questions:

RQ1: What are the differences in emergency hospital contacts during the COVID-19 pandemic for vulnerable CYP compared with other CYP? Is there any evidence that differences are related to COVID-19 infection or the secondary effects of lockdown?

RQ2: What is the predicted deferred health care use and what are the long-term health, education and social care outcomes due to restrictions during the COVID-19 pandemic?

The researcher will use longitudinal linked data from hospital episodes statistics (HES), linked to education and social care data (held by DfE) to assess the impact of the COVID-19 pandemic on CYP and in particular vulnerable CYP. As vulnerable CYP are hard to identify in healthcare records, the researcher will identify these CYP through administrative data histories of ever being a Child in Need (CiN), having special educational needs (SEN), a chronic health condition requiring hospitalisation, or combinations of these exposures. The researcher will derive these vulnerability indicators from a linked longitudinal dataset comprising social care, education and hospital records (HES) for all CYP in England. Examining health data from the time of birth to current age (up to, but not including, age 25 years) is critical for identifying markers of vulnerability in administrative data. For example, previous work completed by UCL has shown that chronic underlying conditions, or congenital disorders associated with special education needs may not be recorded at every admission (e.g., asthma may not be recorded when a child is admitted for an operation) and UCL have demonstrated the added value of using the whole longitudinal record.

To enable the analyses to address these research questions, the researcher will link HES data (i.e. HES APC, outpatient, critical care, A&E and ECDS data, plus death registration data) to administrative data contained in the datasets collectively supplied within the National Pupil Dataset (NPD), provided by DfE (the researcher refers to NPD data as education, CiN, and children looked after (CLA)). These datasets (HES-NPD) will be linked by NHS Digital for children and young people in England using pseudonymised linkage keys.

The legal basis for processing personal data for this purpose at UCL falls under Article 6(1)(e) of the General Data Protection Regulation (GDPR), i.e. “a task carried out in the public interest”. It also falls under Article 9(2)(j), “processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes”. The processing of data for this study is a task of public interest as it will provide evidence on the effect of the COVID-19 pandemic on health outcomes and use of healthcare services among vulnerable children. This will benefit and inform policy makers, service providers, vulnerable children and their families.

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

Expected Benefits:

This project aims to produce urgent results on the impact of COVID-19 infection and lockdown on the health of children and young people and in particular vulnerable children, characterised by education and social care indices from the NPD linked datasets. The study will provide vital understanding of the repercussions of the current response strategies on the health and well-being of key population groups, and provide insight into how infection control and lockdown strategies should be developed to better meet the needs of children and young people. These results (preliminary results in October 2020 and January 2021) are critical for addressing current health needs arising from COVID-19 infection and responses, and also for informing strategies for future waves of infection.

The study will compare different groups of vulnerable and non-vulnerable groups of children and young people, using indicators of vulnerability drawn from health, social care and education histories in administrative data. The analyses will address a priority for DHSC policy makers, that COVID-19 and lockdown have resulted in disproportionate impacts on health for some groups. The analyses aim to explore this question for the whole population, to inform policy to better support children and young people, and to better understand which types of vulnerability are most affected.

The study will examine which groups of children and young people did present to services during the lockdown, whether their problems were directly related to COVID-19 or the secondary impact of lockdown, and what underlying health or social risk factors were present. The study also aims to understand unmet need and predict future health needs for the large proportion of CYP who would have been expected to present to services, based on past patterns of care, but did not attend during the pandemic. For example, despite messages urging patients requiring urgent medical treatment to seek care through the appropriate channels (e.g. A&E), during the pandemic there was a dramatic and unexplained decrease in A&E attendances. Serious concerns have been raised about the impact of the resulting treatment delays, yet much more evidence is required to quantify the scale of the problem in different population groups and to predict future/ongoing needs.

The research seeks to help to fill this gap, firstly by evaluating high level differences in impacts for groups of vulnerable (in terms of clinical, socio-demographic and educational needs) versus other children and young people, which will guide the development of more detailed, in depth research within groups for which there is evidence of the most significant adverse impacts. Specific examples include examining the impact of delays in time-sensitive procedures, where delays are expected to have significant and prolonged negative impacts on health and education outcomes (e.g. surgical correction of cleft lip and palate). The results will establish the scale and urgency (e.g. how many children, how extensive were the delays, what are likely the unmet healthcare and education needs) of these impacts and guide the development of reactive policies and changes in service provision to mitigate long-term impacts for specific population groups. The research will also add evidence on the impact of COVID-19 on health inequalities - including for black and minority ethnic groups – and the mechanisms that drive these. The comprehensive geographical coverage and population base of our research is a real strength of this research and will allow the researchers to draw conclusions for all vulnerable children and young people in England, and to identify groups that are being failed by current policy and services.

The project is commissioned and funded by DHSC, and findings will be reported directly to NHS policy makers. The project also addresses two priorities on the impact of COVID19 on vulnerable patients set out by Health Data Research UK. The study will report preliminary results to DHSC policy makers through our regular 2-monthly meetings, through seminars with wider NHS staff (DHSC, PHE and NHS England – Simon Kenny, NHSE clinical director), and through briefing reports and papers published in peer reviewed journals.

Outputs:

All outputs will contain aggregate level data only and all small numbers will be suppressed in line with the HES analysis guide. Outputs will be monitored for compliance with ADRN statistical output controls and the HES Analysis Guides. No potentially disclosive outputs will be shared or published.

Preliminary reports will be shared with DHSC, PHE and NHS England, and with DfE through the ECHILD project and a project advisory group. Preliminary results will be produced for October 2020 (RQ1) and January 2021 (RQ2).

The researcher will submit full reports for fast-track publication in peer reviewed journals and produce briefing reports for DHSC, DfE and other public bodies through the Children and Families Policy Research Unit (CPRU). Findings will also be used in public involvement and engagement events. Study findings will also be disseminated through peer-reviewed academic journals (e.g. BMJ, Lancet Public Health), and social media including lay summaries.

UCL would expect to present findings at conferences such as the Lancet Public Health conference, and International Population Data Linkage Conference within two years of obtaining the data.

Relevant findings will be shared with policy makers, clinicians/health professionals, educators and parent groups particularly in accessible formats (e.g. lay summaries, videos or animations). This could include forums such as the National Children's Bureau (NCB) Young Person and Parent group, the Great Ormond Street Hospital (GOSH) Patient Engagement group. These groups can be accessed through the joint institute of UCL Great Ormond Street Hospital Institute of Child Health, through the North Thames ARC (led by UCL) and through the CPRU. Lay summaries of the study findings can be published on the CPRU website, and linked through websites for these organisations.

The data analyses are conducted on the ONS Secure Research Service. Detailed individual level child data cannot leave the ONS Secure Research Service. Results of analyses can be exported by a secure encrypted transfer system, which is audited.

Any outputs from analyses that are published have to meet statistical disclosure controls that prevent small sizes in accordance with NHS Digital and DfE requirements. Tabulations of aggregate data are assessed for statistical disclosure control and authorized for export by an ONS data scientist not involved in the project.
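As an illustration of the small-number suppression referred to above, the following Python sketch masks low counts in an aggregate table before export. The threshold value and the treatment of zeros are assumptions made for this sketch only; the actual rules are set by the HES Analysis Guide and the ONS SRS output-checking process, and are not stated in this agreement.

```python
# Hypothetical threshold: the text does not give a value, so the applicable
# guidance must be consulted before any real use.
SUPPRESSION_THRESHOLD = 10


def suppress_small_counts(table, threshold=SUPPRESSION_THRESHOLD, marker="*"):
    """Replace non-zero counts below the threshold with a marker.

    `table` maps a category label to a count. Zeros are left visible here,
    which is an assumption of this sketch rather than a stated rule.
    """
    return {
        label: (marker if 0 < count < threshold else count)
        for label, count in table.items()
    }
```

This sketch applies primary suppression only. It does not handle secondary suppression, where further cells are masked so that a suppressed value cannot be recovered from row or column totals.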

Processing:

The application shows data held; this is the data held under NIC-393510. This agreement (NIC-381972) will allow the data to be linked to that which is already held at UCL. The only data requested under NIC-381972 is a record identifier which allows the customer to link the matched records to the ECDS data held under NIC-393510. This is needed because there are no HES IDs in ECDS.

Analyses for RQ1:

The researcher will analyse outcomes for all CYP, and compare differences in health outcomes (e.g. emergency hospital contacts, deaths) between groups classified as vulnerable or not (stratified by age group), in the years before, compared with after, COVID-19 onset. Descriptive analyses will explore whether changes post-COVID-19 reflect increased risks of contacts related to infection, mental health, adversity, or acute complications of chronic or complex health conditions for all CYP, and according to whether they had indices for vulnerability or not. The researcher will use diagnostic and procedure codes, and types of hospital contact (eg emergency/elective admission), recorded in HES, to infer whether hospital contacts are related to underlying chronic conditions. Preliminary results October 2020.

Analyses for RQ2:

The researcher will model expected healthcare contacts for all CYP after the onset of the COVID-19 pandemic, based on observed trajectories of healthcare contact for CYP for periods prior to the start of COVID-19. The researcher will assess potential unmet healthcare need, according to age and vulnerability status, by comparing expected vs observed healthcare contacts after COVID-19 onset. Differences between expected and observed new diagnoses and interventions, will be used to infer potential unmet need. The researcher will vary their assumptions about whether and when these healthcare needs may manifest, and whether deferred presentation is likely to be more severe. The researcher will estimate how much deferred healthcare use could add to predicted rates of healthcare contacts in the post-COVID-19 period, and potential long-term outcomes of COVID-19 restrictions. Preliminary results in Jan 2021.
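The expected-versus-observed comparison described for RQ2 can be sketched in Python as follows. This is a deliberately naive illustration: the expectation is the mean of pre-pandemic counts for the same period, whereas the researchers describe modelling trajectories of healthcare contact, and their actual model is not specified here. All function names are hypothetical.

```python
from statistics import mean


def expected_contacts(pre_pandemic_counts):
    """Naive expectation: mean count over comparable pre-pandemic periods."""
    if not pre_pandemic_counts:
        raise ValueError("at least one pre-pandemic count is required")
    return mean(pre_pandemic_counts)


def contact_shortfall(pre_pandemic_counts, observed):
    """Compare expected and observed contacts after COVID-19 onset.

    Returns the expected count, the observed count, the deficit
    (expected minus observed) and the observed/expected ratio.
    """
    expected = expected_contacts(pre_pandemic_counts)
    if expected == 0:
        raise ValueError("expected count is zero; ratio is undefined")
    return {
        "expected": expected,
        "observed": observed,
        "deficit": expected - observed,
        "ratio": observed / expected,
    }
```

A positive deficit would be read as potential unmet need under this simplification. Varying the assumptions about when deferred need presents, as the text describes, would require a fuller model than this.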

The researcher will also conduct analyses of all CYP to explore associations between inequalities (using index of multiple deprivation), ethnic group, and vulnerable vs other CYP, and outcomes measured in health care, NPD data (i.e. education and CiN/CLA) in periods before, during and after the COVID-19 pandemic. The researcher will refresh the linked NPD datasets annually in 2021 and 2022 to evaluate the longer-term impacts of COVID-19 infection and response on health, education and social care outcomes for vulnerable, compared with other CYP.

To address these research questions, HES (APC, A&E, OPD, ECDS, critical care) and death registration data (HES-mortality data) provided by NHS Digital will be linked to administrative data from the National Pupil Dataset (NPD) provided by the Department for Education (DfE) to the ONS SRS. Using a pseudonymised linkage key, these datasets will be linked for all children and young people (CYP) born in England on or after 1.9.1995.



The full cohort (longitudinal data for all children and young people in England) is justified for the following reasons:

1) Longitudinal coverage:

Examining health data from the time of birth to current age (up to, but not including, age 25 years) is critical for identifying markers of vulnerability in administrative data. For example, previous work completed by UCL has shown that chronic underlying conditions, or congenital disorders associated with special education needs may not be recorded at every admission (e.g., asthma may not be recorded when a child is admitted for an operation) and UCL have demonstrated the added value of using the whole longitudinal record. The researchers have requested the minimum data necessary for their research, which reflects the administrative history of the child for a subset of the available fields (e.g. the researchers have requested 60% of available inpatient fields, with no sensitive or identifiable fields).

2) Geographical coverage:

The research aims to draw conclusions that are valid for all children and young people in England. Yet the pandemic has had differential impacts across the country (reflecting both infection rates and public health responses) at different times. For example surveys (e.g. RCPCH) indicate geographical heterogeneity, including re-routing/re-deployment of healthcare staff and services, uptake of school access by eligible children, which are likely to disproportionately impact on areas with higher levels of overcrowding, less outside space, and greater deprivation. However, many surveys have incomplete coverage by geography or over time, making it difficult to accurately estimate the scale of the problem. Understanding time-varying patterns of change is increasingly important as public health responses shift towards localised management (e.g. local lockdowns) to control spread. The researchers therefore need data that makes it possible to understand local area impacts. The researchers have requested the minimum granularity possible, for example by requesting MSOA rather than LSOA.

3) Cohort:

The research focuses on the impact of the pandemic and lockdown on vulnerable children and young people (further defined below). Reliable identification of children meeting this definition is not trivial, requiring longitudinal data from birth across health, education and social care. As a result there are relatively few robust estimates of the size of this population.

However, there is good evidence that these indicators of vulnerability are common. For example, new research estimates that 25% of all children are ever designated a child in need and that 44% are ever referred to children’s social care before the age of 16 years. A further subset of children will have other indicators of vulnerability reflecting health or educational needs. In order for the research to draw meaningful conclusions the researchers wish to draw comparisons between the impact of the pandemic and lockdown on different groups of vulnerable children relative to a series of control children. The researchers will draw high level comparisons (e.g. to all other children) relevant to evaluating impacts at national level and for international comparison, as well as detailed comparisons against synthetic control groups (e.g. through propensity score matching) to better understand the impacts of vulnerability in the context of related factors such as local environment, access to schools and healthcare needs.

The researchers therefore require data for all children and young people in England as without these data our comparisons would be incomplete, at greater risk of selection bias and not generalisable.

Linkage of identifiers from HES-mortality data and NPD will be conducted at NHS Digital, which will then transfer only the pseudonymised linkage key to the UCL Data Safe Haven to flag linked records in the UCL-curated HES extract for transfer to the ONS Secure Research Service (SRS). NPD attribute data will only be available in the ONS SRS. Using the pseudonymised linkage key, linkage of pseudonymised attribute data (clinical or education characteristics) will then occur separately, at the ONS SRS. The following outline describes the complete data flow and details how identifiable and non-identifiable data extracts will be handled:

1) DfE will supply the Trusted Third Party (NHS Digital) with a list of NPD identifier variables. These identifiers include name, date of birth, full postcode and sex, alongside a study-specific pseudonymised linkage key known as the anonymised Pupil Matching Reference (aPMR). The identifying variables will be used for linkage to the Personal Demographic Service (PDS) (as previously done for NIC 27404). DfE will transfer the variables for any CYP born on or after cohort inception (1.9.95).

2) NHS Digital will match the identifiers from DfE to records held in the PDS using an algorithm that makes use of the chronology of postcodes in NPD and PDS. Matching to PDS data will be done internally within NHS Digital; no PDS data will be disseminated to ONS SRS or UCL Data Safe Haven. NHS Digital will link the NPD pseudonymised linkage key (i.e. anonymised PMR or young person ID) to PDS, and then to the ECDS data.

3) For those CYP whose NPD identifiers were matched to PDS, onward linkage to HES-mortality data will occur within NHS Digital to link aPMRs and HES-IDs. NHS Digital will then transfer encrypted HES-IDs, aPMRs, and indicators of match rank (denoting the step at which the match to HES and PDS was made) for these linked cases to the UCL Data Safe Haven for linkage to the existing HES-mortality extract held by UCL (NIC-393510).

4) UCL will extract the HES-mortality data for all CYP born on or after 1.9.1995, using the existing HES extract (NIC 393510), and link the aPMR and match rank statistics for those CYP that were linked by NHS Digital in step (3) from NPD. The de-identified HES-mortality extract will be transferred to the ONS SRS. Only month/year of birth and death will be transferred to the ONS SRS, in order to account for well-established effects of month of birth on school achievement (i.e. research consistently shows that children born in September do better than children born in July/August). The attribute data will also include high-level categorical maternal indicators relevant to birth (e.g. parity – 0, 1, 2+ prior births), which have an important bearing on child health. These derived health indicators are created by analysing variables in our analysis files, that are covered by this DSA (e.g. diagnosis codes and baby tail) and for which we have the appropriate permissions.

5) DfE will supply ONS SRS with requested de-identified attribute data extracts, with the aPMR for all CYP born on or after 1.9.1995. The deidentified attribute NPD and HES data will be linked within the ONS SRS by the research team, using the aPMR. Data will only be used by researchers authorised for the project, with strict output controls applied by ONS SRS staff.

6) The final data set that will be used for analyses will remain within the ONS SRS. The files will not contain any identifiable data. No additional record level data will be gathered or linked to the dataset. The aPMR is the only variable supplied from NPD data that is supplied by NHS Digital to UCL Data Safe Haven and then to ONS SRS.

7) NHS Digital will retain the identifier file of all individuals linked in NPD-PDS and PDS-HES and all the postcodes used in linkage and postcode dates for 12 months to address data queries or potential linkage errors. This data set will not contain any attribute data and will be accessible only to NHS Digital staff. At the end of the 12 months, NHS Digital will confirm deletion of the data to DfE. NHS Digital will not send confidential data to DfE or UCL DSH.
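Steps 4 and 5 of the data flow above can be sketched in Python as below. This is an illustration under stated assumptions, not the project's actual processing: the field names (`hes_id`, `dob`, `apmr`, `match_rank`), the decision to drop the HES-ID before transfer, and the handling of unlinked records are all hypothetical.

```python
def to_month_year(iso_date):
    """Coarsen an ISO date (YYYY-MM-DD) to month and year (YYYY-MM)."""
    return iso_date[:7]


def prepare_hes_extract(hes_records, linkage_keys):
    """Step 4 (sketch): attach the aPMR and match rank, then de-identify.

    hes_records: list of dicts with 'hes_id', 'dob' and attribute fields.
    linkage_keys: dict mapping HES-ID -> (aPMR, match rank), standing in
    for the linkage file transferred by NHS Digital.
    """
    prepared = []
    for record in hes_records:
        apmr, rank = linkage_keys.get(record["hes_id"], (None, None))
        # Assumption: the HES-ID and full date of birth are not passed on.
        row = {k: v for k, v in record.items() if k not in ("hes_id", "dob")}
        row["birth_month"] = to_month_year(record["dob"])
        row["apmr"] = apmr
        row["match_rank"] = rank
        prepared.append(row)
    return prepared


def link_in_srs(hes_extract, npd_attributes):
    """Step 5 (sketch): join HES rows to NPD attributes on the aPMR."""
    linked = []
    for row in hes_extract:
        npd_row = npd_attributes.get(row["apmr"])
        if npd_row is not None:
            linked.append({**row, **npd_row})
    return linked
```

The point of the design is that the two joins happen in different places: identifiers are matched only inside NHS Digital, while attribute data meet only inside the ONS SRS, with the aPMR as the sole shared key.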

LSHTM and IFS will each have a named researcher and PI who will undertake analyses (on the ONS SRS) relating to a specific component of the wider research question on the impact of the pandemic on vulnerable children and young people. For example, LSHTM will examine the impact (on health and education) of delays in time-sensitive procedures (e.g. surgical correction of cleft lip and palate) on children with underlying health conditions. These named researchers will be substantive employees of the respective organisations.

The de-identified linked HES-NPD attribute data will be held on the ONS SRS and will only be accessible remotely from the UCL Data Safe Room which has restricted and monitored access. No record level data can be removed from the ONS SRS and statistical disclosure controls are applied by ONS staff. Access will be restricted to named users, who are part of the study team and are accessing the data for the purposes outlined in this DSA. Access to the data is via the ONS SRS environment.

The Office for National Statistics (ONS) and UCL have signed and maintain an organisational agreement to use the ONS Secure Research Service (SRS) for the purposes of secure statistical research, signed on 21/03/2019 with an indefinite expiry date. The only HES data stored in the ONS SRS are the 4 one-year cohorts of HES, plus the anonymised PMRs for those records that link to NPD. The HES data will include month of death and month of birth, and so no identifying data.

For security and resource reasons the SRS is a Managed Service. Equiniti Ltd (based in Belfast) maintains the system on behalf of the ONS SRS. They do so by remotely accessing (RA) the SRS through an encrypted (TLS1.2) VPN tunnel. All Equiniti Ltd administrators are SC cleared and have no access to any data. ONS SRS Research Support “Admin” staff only have permissions to carry out such tasks as creating users, updating patches, testing and installing software applications, arranging DR, ITHC for the SRS environment, closing SRS sessions down, i.e. all the SRS environment Admin maintenance - essentially they are “power users”. There have been no data infractions by Equiniti Ltd staff in the last 5 years of them maintaining the environment; they have been very professional.

The high level security document that Equiniti Ltd provided states: 6.3. Service Management support for the SRS Service is provided from Equiniti offices in Belfast, all staff are SC cleared. The office hosting the SRS Desk is ISO/IEC27001 2018 certified. Neither Equiniti Ltd nor any of their staff process the data. Therefore Equiniti Ltd is not considered to be a Data Processor.

The ONS SRS environment is an isolated system. It has no connectivity to the internet other than using it as a bearer to pass TLS1.2 encrypted image packages for a virtual desktop infrastructure (VDI), hosted on an accredited cloud server hosted by UKCloud Ltd on the mainland UK. UKCloud Ltd merely host the environment; they have no access to data. Therefore UKCloud Ltd is not considered to be a Data Processor.

CLOUD SECURITY
NHS Digital Security has provided assurance regarding the use of the Office for National Statistics' Secure Research Service (ONS SRS), hosted by UKCloud Ltd, in this application. The Office for National Statistics has submitted a selection of security documentation to support the use of cloud storage. NHS Digital Security have reviewed the documentation and provided relevant feedback, where necessary. NHS Digital are satisfied that the documentation demonstrates the level of security and governance in place.

The Office for National Statistics have supplied evidence to support:
• The use of the Data Risk Model to assess the Risk Profile Class.
• Risk Management of the use of the Cloud for this data, taking into consideration Confidentiality, Integrity and Availability.
• The use of Pseudonymisation.
• Board level involvement in the Risk Management Process evidenced through Minutes of these meetings.
• Understanding of the Shared Responsibility Model

The Office for National Statistics have a very good understanding of the security controls available to them to provide the appropriate controls to secure data in the Cloud.

Using the Cloud brings the benefit of inherited controls that cannot practically be replicated locally, such as Physical Controls, Resilience of Systems, Power Supplies, Communications and Geographically dispersed Data Centres within a region.

Elasticity in provisioning is also a consideration that benefits organisations in managing workloads. The Cloud provider, UKCloud, will use UK Data Centres only.


MR472B - SABRE: Southall and Brent Revisited - Consented participants — DARS-NIC-148407-LRP3M

Type of data: information not disclosed for TRE projects

Opt outs honoured: No - consent provided by participants of research study, No - data flow is not identifiable, Anonymised - ICO Code Compliant, Identifiable, No (Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012, Other-Data originally supplied on the basis of National Health Service Act 2001 – s60, and subsequently National Health Service Act 2006 - s251., Other-Data originally supplied on the basis of National Health Service Act 2001 – s60. Subsequent data releases under Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistical Registration Service Act 2007., Other-Data originally supplied on the basis of National Health Service Act 2001 – s60, and subsequently National Health Service Act 2006 - s251 - 'Control of patient information'. | New data to be disseminated on the basis of Informed Patient consent to, Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistical Registration Service Act 2007 , Informed Patient consent to permit the receipt, processing and release of data by the HSCIC, Health and Social Care Act 2012 – s261(2)(c), Informed Patient consent to permit the receipt, processing and release of data by NHS Digital, Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(2)(c), Health and Social Care Act 2012 – s261(2)(b)(ii)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2019-04-01 — 2022-03-31 2017.12 — 2024.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Hospital Episode Statistics Admitted Patient Care
  2. MRIS - Flagging Current Status Report
  3. MRIS - Cause of Death Report
  4. MRIS - Cohort Event Notification Report
  5. MRIS - Members and Postings Report
  6. Demographics
  7. Civil Registration - Deaths
  8. Cancer Registration Data
  9. MRIS - Scottish NHS / Registration
  10. MRIS - List Cleaning Report
  11. Hospital Episode Statistics Admitted Patient Care (HES APC)
  12. Civil Registrations of Death

Objectives:

The data supplied by the NHS IC to University College London will be used only for the approved Medical Research Project MR472.

Yielded Benefits:

The research has also enabled improved ascertainment of incident coronary heart disease and stroke events and resulted in 10 publications in high impact journals relating these outcomes to risk factors measured in mid-life (ages 40-70 at baseline). Findings include:
- Diabetes incidence in older British South Asians and African Caribbeans remains at least 2-fold even at age 80 years compared with British Europeans. The ethnic differentials in women were largely explained by midlife truncal obesity and insulin resistance, but the study was unable to explain the ethnic difference in men. The study showed that obesity cut-points of 24 and 27 kg/m2 in South Asians and African Caribbeans respectively were equivalent to a body mass index of 30kg/m2 in Europeans in terms of diabetes risk; these latter analyses contributed to recent NICE guidelines for prevention of diabetes. Diabetes was also found to be more ‘toxic’ in terms of stroke risk in the ethnic minorities. Widely used tools (Framingham and QRISK2) for estimation of cardiovascular disease risk were found to be less precise in South Asians and African Caribbeans (particularly women), while a selection of 3 metabolic markers measured by NMR spectroscopy was found to be strongly predictive of cardiovascular risk regardless of ethnicity.
- Lack of adherence to four combined health behaviours was associated with a 2 to 3-fold increased risk of incident CVD in Europeans and South Asians. A substantial population impact in the South Asian group indicates important potential for disease prevention in this high-risk group by adherence to healthy behaviours.
- The study has also found marked ethnic differences in associations between blood pressure parameters and stroke and concluded that undue focus on systolic blood pressure for risk prediction, and current age and treatment thresholds may be inappropriate for individuals of South Asian ancestry.
- The study reported detrimental associations between air pollution (particulate measures) and cardiovascular disease mortality in both the SABRE and NHSD cohorts. The study highlighted ethnic differences in associations between prediabetes in midlife and later development of coronary heart disease and stroke.
- The study has confirmed ongoing excess coronary heart disease incidence in South Asians, with lower incidence in African Caribbeans compared with Europeans and confirmed elevated risk of stroke in both ethnic minority groups. Measured baseline metabolic risk factors could not explain the ethnic group differences. Future work in the cohort will examine whether these ethnic differentials continue into older age and whether newer genetic, epigenetic and metabolomic analyses will add to understanding of the underlying mechanisms. Of particular concern was a much stronger association between diabetes and stroke risk in both ethnic minority groups compared with Europeans with diabetes - an association which is the subject of ongoing study in the SABRE cohort. HES data contributed importantly to the identification of incident coronary and stroke events reported in these analyses.

Expected Benefits:

The rich phenotypic and genotypic dataset gathered over a 25 year period will enable analyses assessing mid-life predictors of health and ill-health in older age and will enable unique analyses of how these associations may be related to ethnicity and migration. Good physical and cognitive functions are vital to healthy ageing and factors which influence these across the life course are poorly understood, particularly in non-European populations. As the cohort is reaching older age, an increase in risk of heart failure, which can be severely debilitating, is expected. Ethnic differentials in heart failure rates are not well studied to date. Increasing length of follow-up and novel analytic techniques, both statistical and relating to stored images and samples, bring opportunities for more sophisticated analyses. The addition of hospital admission data to key outcome variables enhances the study’s power to identify events and to further elucidate mechanisms underlying the very marked ethnic differences in cardiometabolic disorders which were observed at visit 2.

Understanding of mechanisms in people of different ethnicities will ultimately lead to appropriate preventive strategies and treatments at different stages of life.

As noted with some detailed examples under ‘specific outputs’, previous use of HES data (1989-2011) enabled improved ascertainment of incident coronary heart disease and stroke events and resulted in 10 publications in high impact journals relating these outcomes to risk factors measured in mid-life (ages 40-70 at baseline). A brief summary of some of these findings in the cohort to 2011 follows:

Diabetes incidence in older British South Asians and African Caribbeans remains at least 2-fold higher even at age 80 years compared with British Europeans. The ethnic differentials in women were largely explained by midlife truncal obesity and insulin resistance, but the study was unable to explain the ethnic difference in men. The study showed that obesity cut-points of 24 and 27 kg/m2 in South Asians and African Caribbeans respectively were equivalent to a body mass index of 30 kg/m2 in Europeans in terms of diabetes risk; these latter analyses contributed to recent NICE guidelines for prevention of diabetes. Diabetes was also found to be more ‘toxic’ in terms of stroke risk in the ethnic minorities. Widely used tools (Framingham and QRISK2) for estimation of cardiovascular disease risk were found to be less precise in South Asians and African Caribbeans (particularly women), while a selection of 3 metabolic markers measured by NMR spectroscopy was found to be strongly predictive of cardiovascular risk regardless of ethnicity.

Lack of adherence to four combined health behaviours was associated with a 2 to 3-fold increased risk of incident CVD in Europeans and South Asians. A substantial population impact in the South Asian group indicates important potential for disease prevention in this high-risk group by adherence to healthy behaviours.

The study also found marked ethnic differences in associations between blood pressure parameters and stroke and concluded that undue focus on systolic blood pressure for risk prediction, and current age and treatment thresholds may be inappropriate for individuals of South Asian ancestry. This is not an exhaustive list of study findings in relation to incident cardiometabolic disease but indicates that the study is building a steady accumulation of understanding of ethnic differentials. There is clear need for further study, which this cohort is uniquely able to address. The addition of HES data to 2016 is key to maximising event ascertainment in old age.

Outputs:

Study findings will continue to be published in peer-reviewed scientific journals, predominantly related to epidemiology, cardiovascular and metabolic disorders, cognitive, physical and psychological function, but also including more generic journals such as the BMJ, reflecting the increasing focus on overall health in older age. Publications will contain only aggregate level data without local identifiers and with suppression of small numbers in line with HES analysis guide.

Publications to date are listed on the study website: www.sabrestudy.org. All publications since 2008 are open-access. The audience is expected to consist mainly of academic researchers and clinicians.

Two examples of previous SABRE study related publications are listed below. Both sets of analyses were importantly informed by data from a previous HES extract (no longer retained) and were published in high-impact factor peer-reviewed journals. These generated considerable media interest and are widely cited.

Tillin T, Hughes AD, Mayet J, Whincup P, Sattar N, Forouhi NG, McKeigue PM, Chaturvedi N. The relationship between metabolic risk factors and incident cardiovascular disease in Europeans, South Asians and African Caribbeans. SABRE (Southall and Brent revisited) – a prospective population based study. J Am Coll Cardiol. 2013 Apr 30;61(17):1777-86. http://dx.doi.org/10.1016/j.jacc.2012.12.046. This paper, published in JACC, the world no 1 cardiology journal (impact factor 16.5), confirmed ongoing excess coronary heart disease incidence in South Asians, with lower incidence in African Caribbeans compared with Europeans and confirmed elevated risk of stroke in both ethnic minority groups. Measured baseline metabolic risk factors could not explain the ethnic group differences. Future work in the cohort will examine whether these ethnic differentials continue into older age and whether newer genetic, epigenetic and metabolomic analyses will add to understanding of the underlying mechanisms. Of particular concern was a much stronger association between diabetes and stroke risk in both ethnic minority groups compared with Europeans with diabetes, an association which is the subject of ongoing study in the SABRE cohort. HES data, although not directly reported in the manuscript, contributed importantly to the identification of incident coronary and stroke events reported in these analyses.

Tillin T, Hughes AD, Godsland IF, Whincup P, Forouhi NG, Welsh P, Sattar N, McKeigue PM, Chaturvedi N. Insulin resistance and truncal obesity as important determinants of the greater incidence of diabetes in Indian Asians and African Caribbeans compared to Europeans? The Southall And Brent REvisited (SABRE) cohort. Diabetes Care 2013;36(2):383-393. http://care.diabetesjournals.org/content/36/2/383.long. This paper demonstrated the extraordinarily high risk of incident diabetes continuing into old age in South Asians and African Caribbeans in comparison with Europeans. Metabolic pathways leading to diabetes remain poorly understood. The study found that baseline insulin resistance and truncal obesity could explain the ethnic differences in women but not in men. Further work continues to determine the reasons for the excess risk in men and to understand what underlies insulin resistance and truncal obesity. HES data, although not directly reported in this manuscript, supported these analyses by enabling sensitivity analyses to assess the effects of bias due to loss to follow-up.

A further 8 journal publications have examined associations between baseline risk factors and incident coronary heart disease or stroke where the outcomes were a composite of first events identified through participant reported events, primary care record review identified events and HES identified hospital admissions. One of these was published in Circulation (Wurtz et al), impact factor 14.3 and demonstrated in 3 separate population based studies (including SABRE) that metabolite profiling in large prospective cohorts identified phenylalanine, monounsaturated fatty acids, and polyunsaturated fatty acids as biomarkers for cardiovascular risk, substantiating the value of high-throughput metabolomics for biomarker discovery and improved risk assessment. Another publication in Heart (Tillin et al) identified that 2 widely used cardiovascular risk prediction tools (QRISK2 and Framingham) did not perform consistently well in all ethnic groups and suggested that further validation of QRISK2 in other multi-ethnic datasets, and better methods for identifying high risk African Caribbeans and South Asian women, are required.

In addition to journal publications, UCL will continue to submit abstracts for presentation at national and international conferences, such as Diabetes UK, the European Association for the Study of Diabetes, the European Society of Cardiology, Artery, and the British Hypertension Society. All data for abstracts/presentations will be at aggregate level with suppression of small numbers in line with HES analysis guide.

The study team will further disseminate findings via participant and GP feedback sessions, newsletters, and the study website. All data for these occasions will be at aggregate level with suppression of small numbers in line with HES analysis guide.

At the end of the current funding period (2018) a report will be submitted to the funders (the British Heart Foundation) summarising findings. This may be published on their website. It will contain only aggregate level data, with small numbers suppressed in line with HES Analysis Guide.

Processing:

The identifiers of SABRE participants have previously been shared with NHS Digital’s predecessor organisation(s) and NHS Digital has provided regular event notifications including notifications of mortality and cancer registrations. The cohort was previously split into two groups: cancer notifiable participants and non-cancer notifiable participants.

The cohort will be reorganised into three groups: participants who gave informed consent (cancer notifiable); cancer notifiable participants covered by section 251 support, and non-cancer notifiable participants covered by section 251 support. To ensure that participants are correctly reorganised into the appropriate groups, UCL will send NHS Digital 3 separate files (one for each respective group) containing participant identifiers.

NHS Digital will then provide reports on a monthly basis while the study is in active follow-up. Notifications will contain no participant identifiers other than unique study Pseudo-IDs. Month and Year of Death will also be included.

NHS Digital will link the respective cohort groups to HES data and will supply to UCL encrypted files containing hospital admissions data identified only by study Pseudo-ID and encrypted HESID and containing no other identifiers. The dataset will be placed immediately into UCL’s Data Safe Haven.

Using the Pseudo-ID, the data is linked at record level to the existing dataset of mortality and cancer records, clinical measures, primary care record review and participant responses to health and lifestyle questionnaires across the course of the study. The data is stored in an encrypted file within the Data Safe Haven at the Gower Street location. The data can be remotely accessed at the Institute of Cardiovascular Science by accredited SABRE study researchers only – all of whom are substantive employees of UCL. Access must be approved by the Data Manager.

The data supplied by NHS Digital will not be downloaded or otherwise transferred from the Data Safe Haven. Data including variables derived from the NHS Digital data may be downloaded from the Data Safe Haven and stored on a UCL server at the Institute of Cardiovascular Science to be used solely for the purposes of statistical analyses in accordance with the study objectives. Such variables include, for example, date of first admission related to a diagnosis of coronary heart disease but will not include any part of the dataset supplied by NHS Digital. Using this pseudonymised dataset, study analysts will examine associations between risk factors measured during the course of the study and cardiometabolic events. The rich phenotypic and genotypic dataset will enable identification of ethnic differences in cardiometabolic disease risk and physical, mental and cognitive function into older age and it will be possible to identify which measured risk factors may explain ethnic differentials and at which period of life they may act most strongly.

To meet study objectives UCL requires information on admissions where diagnostic code lists include coronary heart disease, stroke, heart failure, diabetes, renal failure, dementia, retinopathy, hypertension, and other cardiovascular disease. Respiratory diseases will also be studied, and mental health disorders and other common disorders which are considered to exert important influences on function and well-being in older age may be added. As an example, from the HES extract, and within the UCL Data Safe Haven, it is expected that a variable will be generated which identifies a first or subsequent admission with coronary heart disease (ICD-9 codes 410 through 415 or ICD-10 codes I200 through I259, or any of the following operation codes from the Office of Population Censuses and Surveys (OPCS) classification of interventions and procedures: K401 through K469, K491 through K504, K751 through K759, or U541 (coronary revascularization interventions or rehabilitation for ischemic heart disease)). Date of first or subsequent event would be summarised as year of event.
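
The derivation described above could be sketched roughly as follows. This is an illustrative Python sketch only, not the study's actual code: the record layout (`pseudo_id`, `admission_date`, `diagnoses`, `procedures`) and all function names are assumptions made for the example, and the code ranges are copied from the paragraph above. It further assumes that codes are stored without a decimal point, that diagnosis codes beginning with a digit are ICD-9, and that matching the first three characters of an ICD-10 code against I20 to I25 is equivalent to the stated range I200 through I259.

```python
from datetime import date

# Code ranges copied from the paragraph above (dot-free form).
ICD9_RANGE = (410, 415)        # first three digits of an ICD-9 code
ICD10_RANGE = ("I20", "I25")   # three-character categories covering I200-I259
OPCS_RANGES = [("K401", "K469"), ("K491", "K504"), ("K751", "K759"), ("U541", "U541")]


def normalise(code):
    """Upper-case a code and drop any decimal point or surrounding spaces."""
    return code.replace(".", "").strip().upper()


def is_chd_diagnosis(code):
    """True if a diagnosis code falls in the CHD ranges listed above."""
    c = normalise(code)
    if len(c) < 3:
        return False
    prefix = c[:3]
    if prefix.isdecimal():  # assumption: numeric-leading codes are ICD-9
        return ICD9_RANGE[0] <= int(prefix) <= ICD9_RANGE[1]
    return ICD10_RANGE[0] <= prefix <= ICD10_RANGE[1]


def is_chd_procedure(code):
    """True if an operation code falls in one of the listed OPCS ranges."""
    c = normalise(code)[:4]
    return len(c) == 4 and any(lo <= c <= hi for lo, hi in OPCS_RANGES)


def first_chd_year(admissions):
    """Map each pseudo_id to the year of its earliest CHD admission."""
    first = {}
    for adm in admissions:
        hit = any(is_chd_diagnosis(d) for d in adm["diagnoses"]) or any(
            is_chd_procedure(p) for p in adm["procedures"]
        )
        if hit:
            year = adm["admission_date"].year
            pid = adm["pseudo_id"]
            first[pid] = min(year, first.get(pid, year))
    return first


# Made-up records for illustration only; they are not study data.
example = [
    {"pseudo_id": "P1", "admission_date": date(2005, 3, 1), "diagnoses": ["I219"], "procedures": []},
    {"pseudo_id": "P1", "admission_date": date(2001, 6, 1), "diagnoses": ["J180"], "procedures": ["K402"]},
    {"pseudo_id": "P2", "admission_date": date(1990, 1, 1), "diagnoses": ["4109"], "procedures": []},
    {"pseudo_id": "P3", "admission_date": date(2010, 1, 1), "diagnoses": ["I10X"], "procedures": ["K508"]},
]
print(first_chd_year(example))
```

Comparing dot-free codes as strings works in this sketch because both bounds of each range have the same length. Only the year is kept, in line with the statement that the date of an event would be summarised as year of event.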

The data is stored separately to participant identifiers. The two datasets will not be re-linked and the data will remain pseudonymised as described above. Month and Year of Death are stored in the dataset and used for statistical analyses but the dataset does not include full Date of Death. Participant identifiers are retained separately solely for study administration purposes.


MR472A - SABRE: Southall and Brent Revisited - S251 participants not cancer notifiable — DARS-NIC-99077-Q0K6Z

Type of data: information not disclosed for TRE projects

Opt outs honoured: Yes - patient objections upheld, Anonymised - ICO Code Compliant, Identifiable, Yes (Section 251 NHS Act 2006)

Legal basis: Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistical Registration Service Act 2007 , Health and Social Care Act 2012, Section 251 approval is in place for the flow of identifiable data, National Health Service Act 2006 - s251 - 'Control of patient information'. , Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 – s261(2)(b)(ii), Health and Social Care Act 2012 – s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'., Health and Social Care Act 2012 - s261(5)(d)

Purposes: No (Academic)

Sensitive: Sensitive, and Non Sensitive, and Non-Sensitive

When:DSA runs 2019-04-01 — 2022-03-31 2017.09 — 2024.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. MRIS - Cause of Death Report
  2. MRIS - Cohort Event Notification Report
  3. Hospital Episode Statistics Admitted Patient Care
  4. MRIS - Members and Postings Report
  5. Civil Registration - Deaths
  6. Demographics
  7. MRIS - Flagging Current Status Report
  8. MRIS - List Cleaning Report
  9. Hospital Episode Statistics Admitted Patient Care (HES APC)
  10. Civil Registrations of Death

Objectives:

University College London (UCL) requires notifications of mortality and linked HES data for its study cohort for use in the Medical Research Project: SABRE (Southall And Brent Revisited). This is a population-based cohort study, conducted at University College London, funded by the British Heart Foundation in its current 25 year follow-up phase. It is unique as a long-standing tri-ethnic cohort consisting of people of European descent and first generation migrants of South Asian or African Caribbean descent. This is an academic research study focusing on identifying and understanding the underlying reasons for ethnic group and sex differences in cardiometabolic disease and in physical, psychological and cognitive function in older age.

Specific questions for the 25 year follow-up study are:
1. How large are ethnic/sex differences in cardiac function, cognitive function and hippocampal volumes in older age?
2. To what extent do cardiac function, cognitive function and hippocampal volumes change over a 5 year period in each ethnic group?
3. Which risk factors measured in mid-life and in early old age are most strongly associated with current cardiac and cognitive function and hippocampal volumes and with 5 year changes in these parameters? Can these risk factors explain ethnic differences in cardiac and cognitive function?
4. How large are gender differences in current disorders of cardiac and cognitive function and in their associations with current risk factors?
5. Do ethnic differences in incident cardiometabolic disorders persist into older age?
6. Which risk factors or risk factor profiles measured in mid-life and early old age are most strongly associated with incident cardiometabolic disorders and which best explain ethnic differences in incidence?

The study receives ongoing notifications of mortality from NHS Digital. Continuing supply of this data is required in order to meet study objectives. Death and cause of death are key outcomes for the research objectives.

The study has previously utilised the List Cleaning service from time to time when in active follow-up in order to ensure that the correct participant addresses are used in order to contact participants. Use of this service has helped the study to avoid trying to contact deceased participants. The List Cleaning outputs were used to update the administration database (held separately from other data within the UCL data safe haven) so that UCL could write to as many participants as possible inviting them to complete questionnaires or come into the UCL clinic for a detailed investigation. Under this Data Sharing Agreement, UCL may retain List Cleaning outputs received previously but is not permitted to make further use of the List Cleaning service.

Linked HES data is required to identify incident cardiometabolic events (in particular coronary heart disease, heart failure, stroke, dementia, diabetes), and other events which may affect physical and cognitive function, which have occurred during the follow-up period. Details of all hospital episodes involving the cohort (not limited to the previously stated conditions) are required to address key study objectives with regard to physical and cognitive function in older age in association with current and mid-life risk factors. Analysis needs to consider any and all potential contributing factors.

These events will supplement information provided by participant self-report at 20 and 25 years, from primary care medical record review conducted during the 20 year follow-up and from mortality flagging, together with detailed clinical measurements made at the SABRE clinics at baseline, 20 and 25 year follow-up. The SABRE cohort is increasingly elderly (median age of survivors in 2016 = 77 years, range 65-98). At visit 3, although many are willing and able to visit UCL’s clinic and/or to complete questionnaires, many who attended the last follow-up 5 years ago are now too frail or unwell to attend the 25 year follow-up clinic or to complete the health and lifestyle questionnaires, and sadly many have died (approximately 1,500 (31%)). Diagnosis of disease events/states identified during admission to hospital is increasingly important in assessing health in this elderly cohort and will inform all key event outcomes. This is particularly important in assessing health in those otherwise lost to direct follow-up. The data will be used to analyse risk factors measured in mid- and later life in association with these incident events in order to build on current understanding of causal mechanisms.

Data from 1989 to the present is required because participants underwent detailed examinations at baseline (1989-91) and the aim is to follow the cohort’s experiences since then in order to understand what happened in later life and relate that to the baseline. This will enable UCL to gain as complete as possible a picture of hospital admissions, and hence incident events, over the entire cohort follow-up. Data from the entire study period are crucial for determining age of onset of events, as well as the extent and nature of ill-health from mid to later life, and for relating these to current and mid-life cardiometabolic and other risk factors and how these influence the key study outcomes of physical and cognitive function in older life in each of the three ethnic groups.

Yielded Benefits:

The research has also enabled improved ascertainment of incident coronary heart disease and stroke events and resulted in 10 publications in high impact journals relating these outcomes to risk factors measured in mid-life (ages 40-70 at baseline). Findings include:
- Diabetes incidence in older British South Asians and African Caribbeans remains at least 2-fold higher even at age 80 years compared with British Europeans. The ethnic differentials in women were largely explained by midlife truncal obesity and insulin resistance, but the study was unable to explain the ethnic difference in men. The study showed that obesity cut-points of 24 and 27 kg/m2 in South Asians and African Caribbeans respectively were equivalent to a body mass index of 30 kg/m2 in Europeans in terms of diabetes risk; these latter analyses contributed to recent NICE guidelines for prevention of diabetes. Diabetes was also found to be more ‘toxic’ in terms of stroke risk in the ethnic minorities. Widely used tools (Framingham and QRISK2) for estimation of cardiovascular disease risk were found to be less precise in South Asians and African Caribbeans (particularly women), while a selection of 3 metabolic markers measured by NMR spectroscopy was found to be strongly predictive of cardiovascular risk regardless of ethnicity.
- Lack of adherence to four combined health behaviours was associated with a 2 to 3-fold increased risk of incident CVD in Europeans and South Asians. A substantial population impact in the South Asian group indicates important potential for disease prevention in this high-risk group by adherence to healthy behaviours.
- The study has also found marked ethnic differences in associations between blood pressure parameters and stroke and concluded that undue focus on systolic blood pressure for risk prediction, and current age and treatment thresholds may be inappropriate for individuals of South Asian ancestry.
- The study reported detrimental associations between air pollution (particulate measures) and cardiovascular disease mortality in both the SABRE and NHSD cohorts. The study highlighted ethnic differences in associations between prediabetes in midlife and later development of coronary heart disease and stroke.
- The study has confirmed ongoing excess coronary heart disease incidence in South Asians, with lower incidence in African Caribbeans compared with Europeans and confirmed elevated risk of stroke in both ethnic minority groups. Measured baseline metabolic risk factors could not explain the ethnic group differences. Future work in the cohort will examine whether these ethnic differentials continue into older age and whether newer genetic, epigenetic and metabolomic analyses will add to understanding of the underlying mechanisms. Of particular concern was a much stronger association between diabetes and stroke risk in both ethnic minority groups compared with Europeans with diabetes, an association which is the subject of ongoing study in the SABRE cohort. HES data contributed importantly to the identification of incident coronary and stroke events reported in these analyses.

Expected Benefits:

The rich phenotypic and genotypic dataset gathered over a 25 year period will enable analyses assessing mid-life predictors of health and ill-health in older age and will enable unique analyses of how these associations may be related to ethnicity and migration. Good physical and cognitive functions are vital to healthy ageing and factors which influence these across the life course are poorly understood, particularly in non-European populations. As the cohort is reaching older age, an increase in risk of heart failure, which can be severely debilitating, is expected. Ethnic differentials in heart failure rates are not well studied to date. Increasing length of follow-up and novel analytic techniques, both statistical and relating to stored images and samples, bring opportunities for more sophisticated analyses. The addition of hospital admission data to key outcome variables enhances the study’s power to identify events and to further elucidate mechanisms underlying the very marked ethnic differences in cardiometabolic disorders which were observed at visit 2.

Understanding of mechanisms in people of different ethnicities will ultimately lead to appropriate preventive strategies and treatments at different stages of life.

As noted with some detailed examples under ‘specific outputs’, previous use of HES data (1989-2011) enabled improved ascertainment of incident coronary heart disease and stroke events and resulted in 10 publications in high impact journals relating these outcomes to risk factors measured in mid-life (ages 40-70 at baseline). A brief summary of some of these findings in the cohort to 2011 follows:

Diabetes incidence in older British South Asians and African Caribbeans remains at least 2-fold higher even at age 80 years compared with British Europeans. The ethnic differentials in women were largely explained by midlife truncal obesity and insulin resistance, but the study was unable to explain the ethnic difference in men. The study showed that obesity cut-points of 24 and 27 kg/m2 in South Asians and African Caribbeans respectively were equivalent to a body mass index of 30 kg/m2 in Europeans in terms of diabetes risk; these latter analyses contributed to recent NICE guidelines for prevention of diabetes. Diabetes was also found to be more ‘toxic’ in terms of stroke risk in the ethnic minorities. Widely used tools (Framingham and QRISK2) for estimation of cardiovascular disease risk were found to be less precise in South Asians and African Caribbeans (particularly women), while a selection of 3 metabolic markers measured by NMR spectroscopy was found to be strongly predictive of cardiovascular risk regardless of ethnicity.

Lack of adherence to four combined health behaviours was associated with a 2 to 3-fold increased risk of incident CVD in Europeans and South Asians. A substantial population impact in the South Asian group indicates important potential for disease prevention in this high-risk group by adherence to healthy behaviours.

The study also found marked ethnic differences in associations between blood pressure parameters and stroke and concluded that undue focus on systolic blood pressure for risk prediction, and current age and treatment thresholds may be inappropriate for individuals of South Asian ancestry. This is not an exhaustive list of study findings in relation to incident cardiometabolic disease but indicates that the study is building a steady accumulation of understanding of ethnic differentials. There is clear need for further study, which this cohort is uniquely able to address. The addition of HES data to 2016 is key to maximising event ascertainment in old age.

Outputs:

Study findings will continue to be published in peer-reviewed scientific journals, predominantly related to epidemiology, cardiovascular and metabolic disorders, cognitive, physical and psychological function, but also including more generic journals such as the BMJ, reflecting the increasing focus on overall health in older age. Publications will contain only aggregate level data without local identifiers and with suppression of small numbers in line with HES analysis guide.

Publications to date are listed on the study website: www.sabrestudy.org. All publications since 2008 are open-access. The audience is expected to consist mainly of academic researchers and clinicians.

Two examples of previous SABRE study related publications are listed below. Both sets of analyses were importantly informed by data from a previous HES extract (no longer retained) and were published in high-impact factor peer-reviewed journals. These generated considerable media interest and are widely cited.

Tillin T, Hughes AD, Mayet J, Whincup P, Sattar N, Forouhi NG, McKeigue PM, Chaturvedi N. The relationship between metabolic risk factors and incident cardiovascular disease in Europeans, South Asians and African Caribbeans. SABRE (Southall and Brent revisited) – a prospective population based study. J Am Coll Cardiol. 2013 Apr 30;61(17):1777-86. http://dx.doi.org/10.1016/j.jacc.2012.12.046. This paper, published in JACC, the world no 1 cardiology journal (impact factor 16.5), confirmed ongoing excess coronary heart disease incidence in South Asians, with lower incidence in African Caribbeans compared with Europeans and confirmed elevated risk of stroke in both ethnic minority groups. Measured baseline metabolic risk factors could not explain the ethnic group differences. Future work in the cohort will examine whether these ethnic differentials continue into older age and whether newer genetic, epigenetic and metabolomic analyses will add to understanding of the underlying mechanisms. Of particular concern was a much stronger association between diabetes and stroke risk in both ethnic minority groups compared with Europeans with diabetes, an association which is the subject of ongoing study in the SABRE cohort. HES data, although not directly reported in the manuscript, contributed importantly to the identification of incident coronary and stroke events reported in these analyses.

Tillin T, Hughes AD, Godsland IF, Whincup P, Forouhi NG, Welsh P, Sattar N, McKeigue PM, Chaturvedi N. Insulin resistance and truncal obesity as important determinants of the greater incidence of diabetes in Indian Asians and African Caribbeans compared to Europeans? The Southall And Brent REvisited (SABRE) cohort. Diabetes Care 2013;36(2):383-393. http://care.diabetesjournals.org/content/36/2/383.long. This paper demonstrated the extraordinarily high risk of incident diabetes continuing into old age in South Asians and African Caribbeans in comparison with Europeans. Metabolic pathways leading to diabetes remain poorly understood. The study found that baseline insulin resistance and truncal obesity could explain the ethnic differences in women but not in men. Further work continues to determine the reasons for the excess risk in men and to understand what underlies insulin resistance and truncal obesity. HES data, although not directly reported in this manuscript, supported these analyses by enabling sensitivity analyses to assess the effects of bias due to loss to follow-up.

A further 8 journal publications have examined associations between baseline risk factors and incident coronary heart disease or stroke where the outcomes were a composite of first events identified through participant reported events, primary care record review identified events and HES identified hospital admissions. One of these was published in Circulation (Wurtz et al), impact factor 14.3 and demonstrated in 3 separate population based studies (including SABRE) that metabolite profiling in large prospective cohorts identified phenylalanine, monounsaturated fatty acids, and polyunsaturated fatty acids as biomarkers for cardiovascular risk, substantiating the value of high-throughput metabolomics for biomarker discovery and improved risk assessment. Another publication in Heart (Tillin et al) identified that 2 widely used cardiovascular risk prediction tools (QRISK2 and Framingham) did not perform consistently well in all ethnic groups and suggested that further validation of QRISK2 in other multi-ethnic datasets, and better methods for identifying high risk African Caribbeans and South Asian women, are required.

In addition to journal publications, UCL will continue to submit abstracts for presentation at national and international conferences, such as Diabetes UK, the European Association for the Study of Diabetes, the European Society of Cardiology, Artery, and the British Hypertension Society. All data for abstracts/presentations will be at aggregate level with suppression of small numbers in line with HES analysis guide.

The study team will further disseminate findings via participant and GP feedback sessions, newsletters, and the study website. All data for these occasions will be at aggregate level with suppression of small numbers in line with HES analysis guide.

At the end of the current funding period (2018) a report will be submitted to the funders (the British Heart Foundation) summarising findings. This may be published on their website. It will only contain at most aggregate level data, with small numbers suppressed in line with HES Analysis Guide.

Processing:

The identifiers of SABRE participants have previously been shared with NHS Digital’s predecessor organisation(s) and NHS Digital has provided regular event notifications including notifications of mortality and cancer registrations (for eligible participants only). The cohort was previously split into two groups: cancer notifiable participants and non-cancer notifiable participants.

The cohort will be reorganised into three groups: participants who gave informed consent (cancer notifiable); cancer notifiable participants covered by section 251 support, and non-cancer notifiable participants covered by section 251 support. To ensure that participants are correctly reorganised into the appropriate groups, UCL will send NHS Digital 3 separate files (one for each respective group) containing participant identifiers.

NHS Digital will then provide reports on a monthly basis while the study is in active follow-up. Notifications will contain no participant identifiers other than unique study Pseudo-IDs. Month and Year of Death will also be included.

NHS Digital will link the respective cohort groups to HES data and will supply to UCL encrypted files containing hospital admissions data identified only by study Pseudo-ID and encrypted HESID and containing no other identifiers. The dataset will be placed immediately into UCL’s Data Safe Haven.

Using the Pseudo-ID, the data is linked at record level to the existing dataset of mortality, clinical measures, primary care record review and participant responses to health and lifestyle questionnaires across the course of the study. The data is stored in an encrypted file within the Data Safe Haven at the Gower Street location. The data can be remotely accessed at the Institute of Cardiovascular Science by accredited SABRE study researchers only – all of whom are substantive employees of UCL. Access must be approved by the Data Manager.

The data supplied by NHS Digital will not be downloaded or otherwise transferred from the Data Safe Haven. Data including variables derived from the NHS Digital data may be downloaded from the Data Safe Haven and stored on a UCL server at the Institute of Cardiovascular Science to be used solely for the purposes of statistical analyses in accordance with the study objectives. Such variables include, for example, date of first admission related to a diagnosis of coronary heart disease but will not include any part of the dataset supplied by NHS Digital. Using this pseudonymised dataset, study analysts will examine associations between risk factors measured during the course of the study and cardiometabolic events. The rich phenotypic and genotypic dataset will enable identification of ethnic differences in cardiometabolic disease risk and physical, mental and cognitive function into older age and it will be possible to identify which measured risk factors may explain ethnic differentials and at which period of life they may act most strongly.

To meet study objectives UCL requires information on admissions where diagnostic code lists include coronary heart disease, stroke, heart failure, diabetes, renal failure, dementia, retinopathy, hypertension and other cardiovascular disease. Respiratory diseases will also be studied and mental health disorders and other common disorders may be added which are considered to exert important influences on function and well-being in older age. As an example, from the HES extract, and within the UCL Data Safe Haven, it is expected that a variable will be generated which identifies a first or subsequent admission with coronary heart disease (ICD-9 codes 410 through 415 or ICD-10 codes I200 through I259, or any of the following operation codes from the Office of Population Censuses and Surveys (OPCS) Classification of Interventions and Procedures: K401 through K469, K491 through K504, K751 through K759, or U541 (coronary revascularization interventions or rehabilitation for ischemic heart disease)). Date of first or subsequent event would be summarised as year of event.

The data is stored separately to participant identifiers. The two datasets will not be re-linked and the data will remain pseudonymised as described above. Month and Year of Death are stored in the dataset and used for statistical analyses but the dataset does not include full Date of Death. Participant identifiers are retained separately solely for study administration purposes.


MR472 - SABRE: Southall and Brent Revisited - S251 participants — DARS-NIC-91374-Z5V6Y

Type of data: information not disclosed for TRE projects

Opt outs honoured: Yes - patient objections upheld, Anonymised - ICO Code Compliant, Identifiable, Yes (Section 251 NHS Act 2006)

Legal basis: Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistical Registration Service Act 2007 , Health and Social Care Act 2012, Section 251 approval is in place for the flow of identifiable data, National Health Service Act 2006 - s251 - 'Control of patient information'. , Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 – s261(2)(b)(ii), Health and Social Care Act 2012 - s261(5)(d)

Purposes: No (Academic)

Sensitive: Sensitive, and Non Sensitive, and Non-Sensitive

When:DSA runs 2019-04-01 — 2022-03-31 2017.09 — 2024.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. MRIS - Cause of Death Report
  2. MRIS - Cohort Event Notification Report
  3. Hospital Episode Statistics Admitted Patient Care
  4. MRIS - Members and Postings Report
  5. Civil Registration - Deaths
  6. Demographics
  7. Cancer Registration Data
  8. MRIS - Flagging Current Status Report
  9. MRIS - List Cleaning Report
  10. Hospital Episode Statistics Admitted Patient Care (HES APC)
  11. Civil Registrations of Death

Objectives:

University College London (UCL) requires notifications of mortality and cancer registrations and linked HES data for its study cohort for use in the Medical Research Project: SABRE (Southall And Brent Revisited). This is a population-based cohort study, conducted at University College London, funded by the British Heart Foundation in its current 25 year follow-up phase. It is unique as a long-standing tri-ethnic cohort consisting of people of European descent and first generation migrants of South Asian or African Caribbean descent. This is an academic research study focusing on identifying and understanding the underlying reasons for ethnic group and sex differences in cardiometabolic disease and in physical, psychological and cognitive function in older age.

Specific questions for the 25 year follow-up study are:
1. How large are ethnic /sex differences in cardiac function, cognitive function and hippocampal volumes in older age?
2. To what extent do cardiac function, cognitive function and hippocampal volumes change over a 5 year period in each ethnic group?
3. Which risk factors measured in mid-life and in early old age are most strongly associated with current cardiac and cognitive function and hippocampal volumes and with 5 year changes in these parameters? Can these risk factors explain ethnic differences in cardiac and cognitive function?
4. How large are gender differences in current disorders of cardiac and cognitive function and in their associations with current risk factors?
5. Do ethnic differences in incident cardiometabolic disorders persist into older age?
6. Which risk factors or risk factor profiles measured in mid-life and early old age are most strongly associated with incident cardiometabolic disorders and which best explain ethnic differences in incidence?

The study receives ongoing notifications of mortality and cancer registrations from NHS Digital. Continuing supply of this data is required in order to meet study objectives. Death and cause of death are key outcomes for the research objectives and cancer registrations are also key to understanding ethnic disparities in development and survival from the most frequent types of cancer and how these impact upon function in older age.

The study has previously utilised the List Cleaning service from time to time when in active follow-up in order to ensure that the correct participant addresses are used in order to contact participants. Use of this service has helped the study to avoid trying to contact deceased participants. The List Cleaning outputs were used to update the administration database (held separately from other data within the UCL data safe haven) so that UCL could write to as many participants as possible inviting them to complete questionnaires or come into the UCL clinic for a detailed investigation. Under this Data Sharing Agreement, UCL may retain List Cleaning outputs received previously but is not permitted to make further use of the List Cleaning service.

Linked HES data is required to identify incident cardiometabolic events (in particular coronary heart disease, heart failure, stroke, dementia, diabetes), and other events which may affect physical and cognitive function, which have occurred during the follow-up period. Details of all hospital episodes involving the cohort (not limited to the previously stated conditions) are required to address key study objectives with regard to physical and cognitive function in older age in association with current and mid-life risk factors. Analysis needs to consider any and all potential contributing factors.

These events will supplement information provided by participant self-report at 20 and 25 years, from primary care medical record review conducted during the 20 year follow-up and from cancer and mortality flagging, together with detailed clinical measurements made at the SABRE clinics at baseline, 20 and 25 year follow-up. The SABRE cohort is increasingly elderly (median age of survivors in 2016 = 77 years, range 65-98). At visit 3, although many are willing and able to visit UCL’s clinic and/or to complete questionnaires, many who attended the last follow-up 5 years ago are now too frail or unwell to attend the 25 year follow-up clinic or to complete the health and lifestyle questionnaires, and sadly many have died (approximately 1,500 (31%)). Diagnosis of disease events/states identified during admission to hospital is increasingly important in assessing health in this elderly cohort and will inform all key event outcomes. This is particularly important in assessing health in those otherwise lost to direct follow-up. The data will be used to analyse risk factors measured in mid- and later life in association with these incident events in order to build on current understanding of causal mechanisms.

Data from 1989 to the present is required because participants underwent detailed examinations at baseline (1989-91) and the aim is to follow this cohort through their experiences since to understand what happened in later life and relate that to the baseline. This will enable UCL to gain as complete as possible a picture of hospital admissions, and hence incident events, over the entire cohort follow-up. Data from the entire study period are crucial for determining age of onset of events, as well as the extent and nature of ill-health from mid to later life, and for relating these to current and mid-life cardiometabolic and other risk factors and how these influence the key study outcomes of physical and cognitive function in older life in each of the three ethnic groups.

Yielded Benefits:

The research has also enabled improved ascertainment of incident coronary heart disease and stroke events and resulted in 10 publications in high impact journals relating these outcomes to risk factors measured in mid-life (ages 40-70 at baseline). Findings include:
- Diabetes incidence in older British South Asians and African Caribbeans remains at least 2-fold even at age 80 years compared with British Europeans. The ethnic differentials in women were largely explained by midlife truncal obesity and insulin resistance, but the study was unable to explain the ethnic difference in men. The study showed that obesity cut-points of 24 and 27 kg/m2 in South Asians and African Caribbeans respectively were equivalent to a body mass index of 30kg/m2 in Europeans in terms of diabetes risk; these latter analyses contributed to recent NICE guidelines for prevention of diabetes. Diabetes was also found to be more ‘toxic’ in terms of stroke risk in the ethnic minorities. Widely used tools (Framingham and QRISK2) for estimation of cardiovascular disease risk were found to be less precise in South Asians and African Caribbeans (particularly women), while a selection of 3 metabolic markers measured by NMR spectroscopy was found to be strongly predictive of cardiovascular risk regardless of ethnicity.
- Lack of adherence to four combined health behaviours was associated with a 2 to 3-fold increased risk of incident CVD in Europeans and South Asians. A substantial population impact in the South Asian group indicates important potential for disease prevention in this high-risk group by adherence to healthy behaviours.
- The study has also found marked ethnic differences in associations between blood pressure parameters and stroke and concluded that undue focus on systolic blood pressure for risk prediction, and current age and treatment thresholds may be inappropriate for individuals of South Asian ancestry.
- The study reported detrimental associations between air pollution (particulate measures) and cardiovascular disease mortality in both the SABRE and NHSD cohorts. The study highlighted ethnic differences in associations between prediabetes in midlife and later development of coronary heart disease and stroke.
- The study has confirmed ongoing excess coronary heart disease incidence in South Asians, with lower incidence in African Caribbeans compared with Europeans and confirmed elevated risk of stroke in both ethnic minority groups. Measured baseline metabolic risk factors could not explain the ethnic group differences. Future work in the cohort will examine whether these ethnic differentials continue into older age and whether newer genetic, epigenetic and metabolomic analyses will add to understanding of the underlying mechanisms. Of particular concern was a much stronger association between diabetes and stroke risk in both ethnic minority groups compared with Europeans with diabetes- an association which is the subject of ongoing study in the SABRE cohort. HES data contributed importantly to the identification of incident coronary and stroke events reported in these analyses.

Expected Benefits:

The rich phenotypic and genotypic dataset gathered over a 25 year period will enable analyses assessing mid-life predictors of health and ill-health in older age and will enable unique analyses of how these associations may be related to ethnicity and migration. Good physical and cognitive functions are vital to healthy ageing and factors which influence these across the life course are poorly understood, particularly in non-European populations. As the cohort is reaching older age, an increase in risk of heart failure, which can be severely debilitating, is expected. Ethnic differentials in heart failure rates are not well studied to date. Increasing length of follow-up and novel analytic techniques, both statistical and relating to stored images and samples bring opportunities for more sophisticated analyses and the addition of hospital admission data to key outcome variables enhances the study’s power to identify events and to further elucidate mechanisms underlying the very marked ethnic differences in cardiometabolic disorders which were observed at visit 2.

Understanding of mechanisms in people of different ethnicities will ultimately lead to appropriate preventive strategies and treatments at different stages of life.

As noted with some detailed examples under ‘specific outputs’, previous use of HES data (1989-2011) enabled improved ascertainment of incident coronary heart disease and stroke events and resulted in 10 publications in high impact journals relating these outcomes to risk factors measured in mid-life (ages 40-70 at baseline). A brief summary of some of these findings in the cohort to 2011 follows:

Diabetes incidence in older British South Asians and African Caribbeans remains at least 2-fold even at age 80 years compared with British Europeans. The ethnic differentials in women were largely explained by midlife truncal obesity and insulin resistance, but the study was unable to explain the ethnic difference in men. The study showed that obesity cut-points of 24 and 27 kg/m2 in South Asians and African Caribbeans respectively were equivalent to a body mass index of 30kg/m2 in Europeans in terms of diabetes risk; these latter analyses contributed to recent NICE guidelines for prevention of diabetes. Diabetes was also found to be more ‘toxic’ in terms of stroke risk in the ethnic minorities. Widely used tools (Framingham and QRISK2) for estimation of cardiovascular disease risk were found to be less precise in South Asians and African Caribbeans (particularly women), while a selection of 3 metabolic markers measured by NMR spectroscopy was found to be strongly predictive of cardiovascular risk regardless of ethnicity.

Lack of adherence to four combined health behaviours was associated with a 2 to 3-fold increased risk of incident CVD in Europeans and South Asians. A substantial population impact in the South Asian group indicates important potential for disease prevention in this high-risk group by adherence to healthy behaviours.

The study also found marked ethnic differences in associations between blood pressure parameters and stroke and concluded that undue focus on systolic blood pressure for risk prediction, and current age and treatment thresholds may be inappropriate for individuals of South Asian ancestry. This is not an exhaustive list of study findings in relation to incident cardiometabolic disease but indicates that the study is building a steady accumulation of understanding of ethnic differentials. There is clear need for further study, which this cohort is uniquely able to address. The addition of HES data to 2016 is key to maximising event ascertainment in old age.

Outputs:

Study findings will continue to be published in peer-reviewed scientific journals, predominantly related to epidemiology, cardiovascular and metabolic disorders, cognitive, physical and psychological function, but also including more generic journals such as the BMJ, reflecting the increasing focus on overall health in older age. Publications will contain only aggregate level data without local identifiers and with suppression of small numbers in line with HES analysis guide.

Publications to date are listed on the study website: www.sabrestudy.org. All publications since 2008 are open-access. The audience is expected to consist mainly of academic researchers and clinicians.

Two examples of previous SABRE study related publications are listed below. Both sets of analyses were importantly informed by data from a previous HES extract (no longer retained), and were published in high-impact factor peer-reviewed journals. These generated considerable media interest and are widely cited.

Tillin T, Hughes AD, Mayet J, Whincup P, Sattar N, Forouhi NG, McKeigue PM, Chaturvedi N. The relationship between metabolic risk factors and incident cardiovascular disease in Europeans, South Asians and African Caribbeans. SABRE (Southall and Brent revisited) – a prospective population based study. J Am Coll Cardiol. 2013 Apr 30;61(17):1777-86. http://dx.doi.org/10.1016/j.jacc.2012.12.046. This paper published in JACC, the world no 1 cardiology journal (impact factor 16.5) confirmed ongoing excess coronary heart disease incidence in South Asians, with lower incidence in African Caribbeans compared with Europeans and confirmed elevated risk of stroke in both ethnic minority groups. Measured baseline metabolic risk factors could not explain the ethnic group differences. Future work in the cohort will examine whether these ethnic differentials continue into older age and whether newer genetic, epigenetic and metabolomic analyses will add to understanding of the underlying mechanisms. Of particular concern was a much stronger association between diabetes and stroke risk in both ethnic minority groups compared with Europeans with diabetes- an association which is the subject of ongoing study in the SABRE cohort. HES data, although not directly reported in the manuscript, contributed importantly to the identification of incident coronary and stroke events reported in these analyses.

Tillin T, Hughes AD, Godsland IF, Whincup P, Forouhi NG, Welsh P, Sattar N, McKeigue PM, Chaturvedi N. Insulin resistance and truncal obesity as important determinants of the greater incidence of diabetes in Indian Asians and African Caribbeans compared to Europeans? The Southall And Brent REvisited (SABRE) cohort. Diabetes Care 2013;36(2)(383-393). http://care.diabetesjournals.org/content/36/2/383.long. This paper demonstrated the extraordinarily high risk of incident diabetes continuing into old age in South Asians and African Caribbeans in comparison with Europeans. Metabolic pathways leading to diabetes remain poorly understood. The study found that baseline insulin resistance and truncal obesity could explain the ethnic differences in women but not in men. Further work continues to determine the reasons for the excess risk in men and to understand what underlies insulin resistance and truncal obesity. HES data, although not directly reported in this manuscript, supported these analyses by enabling sensitivity analyses to assess the effects of bias due to loss to follow-up.

A further 8 journal publications have examined associations between baseline risk factors and incident coronary heart disease or stroke where the outcomes were a composite of first events identified through participant reported events, primary care record review identified events and HES identified hospital admissions. One of these was published in Circulation (Wurtz et al), impact factor 14.3 and demonstrated in 3 separate population based studies (including SABRE) that metabolite profiling in large prospective cohorts identified phenylalanine, monounsaturated fatty acids, and polyunsaturated fatty acids as biomarkers for cardiovascular risk, substantiating the value of high-throughput metabolomics for biomarker discovery and improved risk assessment. Another publication in Heart (Tillin et al) identified that 2 widely used cardiovascular risk prediction tools (QRISK2 and Framingham) did not perform consistently well in all ethnic groups and suggested that further validation of QRISK2 in other multi-ethnic datasets, and better methods for identifying high risk African Caribbeans and South Asian women, are required.

In addition to journal publications, UCL will continue to submit abstracts for presentation at national and international conferences, such as Diabetes UK, the European Association for the Study of Diabetes, the European Society of Cardiology, Artery, and the British Hypertension Society. All data for abstracts/presentations will be at aggregate level with suppression of small numbers in line with HES analysis guide.

The study team will further disseminate findings via participant and GP feedback sessions, newsletters, and the study website. All data for these occasions will be at aggregate level with suppression of small numbers in line with HES analysis guide.

At the end of the current funding period (2018) a report will be submitted to the funders (the British Heart Foundation) summarising findings. This may be published on their website. It will only contain at most aggregate level data, with small numbers suppressed in line with HES Analysis Guide.
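The small-number suppression mentioned throughout the Outputs section can be illustrated with a minimal sketch. The threshold of 5, the `*` marker and the choice to leave zeros visible are all assumptions made for illustration; the rule that actually applies is whatever the HES Analysis Guide specifies, which this record does not quote. The function name and the example table are hypothetical.

```python
def suppress_small_numbers(counts, threshold=5, marker="*"):
    """Mask small non-zero counts in an aggregate table.

    `threshold` and `marker` are illustrative placeholders, not values
    taken from the HES Analysis Guide.
    """
    masked = {}
    for group, n in counts.items():
        # Assumption: zeros are left visible; only 1..threshold are masked.
        masked[group] = marker if 0 < n <= threshold else n
    return masked


# Hypothetical aggregate table: events per group.
events = {"group_a": 3, "group_b": 12, "group_c": 0}
print(suppress_small_numbers(events))
```

A real disclosure-control step would also need to consider secondary suppression (totals that allow a masked cell to be recovered), which this sketch does not attempt.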

Processing:

The identifiers of SABRE participants have previously been shared with NHS Digital’s predecessor organisation(s) and NHS Digital has provided regular event notifications including notifications of mortality and cancer registrations. The cohort was previously split into two groups: cancer notifiable participants and non-cancer notifiable participants.

The cohort will be reorganised into three groups: participants who gave informed consent (cancer notifiable); cancer notifiable participants covered by section 251 support, and non-cancer notifiable participants covered by section 251 support. To ensure that participants are correctly reorganised into the appropriate groups, UCL will send NHS Digital 3 separate files (one for each respective group) containing participant identifiers.

NHS Digital will then provide reports on a monthly basis while the study is in active follow-up. Notifications will contain no participant identifiers other than unique study Pseudo-IDs. Month and Year of Death will also be included.

NHS Digital will link the respective cohort groups to HES data and will supply to UCL encrypted files containing hospital admissions data identified only by study Pseudo-ID and encrypted HESID and containing no other identifiers. The dataset will be placed immediately into UCL’s Data Safe Haven.

Using the Pseudo-ID, the data is linked at record level to the existing dataset of mortality and cancer records, clinical measures, primary care record review and participant responses to health and lifestyle questionnaires across the course of the study. The data is stored in an encrypted file within the Data Safe Haven at the Gower Street location. The data can be remotely accessed at the Institute of Cardiovascular Science by accredited SABRE study researchers only – all of whom are substantive employees of UCL. Access must be approved by the Data Manager.
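The record-level linkage described above amounts to a join on the study Pseudo-ID. The following is a minimal sketch of that idea only, not the study team's implementation; the field names (`pseudo_id`, `sbp`, `admiyear`) and the function name are hypothetical.

```python
def link_by_pseudo_id(study_records, hes_episodes):
    """Attach each participant's HES episodes to their study record.

    Both inputs are lists of dicts carrying a 'pseudo_id' key; no other
    identifier is needed or used for the join.
    """
    episodes_by_id = {}
    for episode in hes_episodes:
        episodes_by_id.setdefault(episode["pseudo_id"], []).append(episode)

    linked = []
    for record in study_records:
        # Participants with no admissions get an empty episode list.
        matched = episodes_by_id.get(record["pseudo_id"], [])
        linked.append({**record, "hes_episodes": matched})
    return linked


# Hypothetical example data.
study = [{"pseudo_id": "P001", "sbp": 140}, {"pseudo_id": "P002", "sbp": 128}]
hes = [{"pseudo_id": "P001", "admiyear": 2005},
       {"pseudo_id": "P001", "admiyear": 2010}]
linked = link_by_pseudo_id(study, hes)
```

Because the join key is the Pseudo-ID alone, the linked dataset stays pseudonymised as long as the separate file of participant identifiers is never brought into the same environment.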

The data supplied by NHS Digital will not be downloaded or otherwise transferred from the Data Safe Haven. Data including variables derived from the NHS Digital data may be downloaded from the Data Safe Haven and stored on a UCL server at the Institute of Cardiovascular Science to be used solely for the purposes of statistical analyses in accordance with the study objectives. Such variables include, for example, date of first admission related to a diagnosis of coronary heart disease but will not include any part of the dataset supplied by NHS Digital. Using this pseudonymised dataset, study analysts will examine associations between risk factors measured during the course of the study and cardiometabolic events. The rich phenotypic and genotypic dataset will enable identification of ethnic differences in cardiometabolic disease risk and physical, mental and cognitive function into older age and it will be possible to identify which measured risk factors may explain ethnic differentials and at which period of life they may act most strongly.

To meet study objectives UCL requires information on admissions where diagnostic code lists include coronary heart disease, stroke, heart failure, diabetes, renal failure, dementia, retinopathy, hypertension and other cardiovascular disease. Respiratory diseases will also be studied and mental health disorders and other common disorders may be added which are considered to exert important influences on function and well-being in older age. As an example, from the HES extract, and within the UCL Data Safe Haven, it is expected that a variable will be generated which identifies a first or subsequent admission with coronary heart disease (ICD-9 codes 410 through 415 or ICD-10 codes I200 through I259, or any of the following operation codes from the Office of Population Censuses and Surveys (OPCS) Classification of Interventions and Procedures: K401 through K469, K491 through K504, K751 through K759, or U541 (coronary revascularization interventions or rehabilitation for ischemic heart disease)). Date of first or subsequent event would be summarised as year of event.
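The derived coronary heart disease variable described above can be sketched as follows. The code ranges are those quoted in the text; everything else is an assumption for illustration, including the episode layout (`admidate`, `diag`, `opertn`), the function names, the rule that codes starting with a digit are ICD-9, and matching ICD-10 at the three-character category level (I20 to I25).

```python
from datetime import date

# Operation code ranges quoted in the text: (letter, low, high), inclusive.
OPCS_CHD_RANGES = [("K", 401, 469), ("K", 491, 504), ("K", 751, 759), ("U", 541, 541)]


def _normalise(code):
    return code.strip().upper().replace(".", "")


def is_chd_diagnosis(code):
    """True for ICD-9 410-415 or ICD-10 I200-I259 (category-level match)."""
    code = _normalise(code)
    if code[:1].isdigit():
        # Assumption: a leading digit means an ICD-9 code.
        return code[:3].isdigit() and 410 <= int(code[:3]) <= 415
    return (len(code) >= 3 and code[0] == "I"
            and code[1:3].isdigit() and 20 <= int(code[1:3]) <= 25)


def is_chd_procedure(code):
    """True for the operation codes listed in the text."""
    code = _normalise(code)
    if len(code) < 4 or not code[1:4].isdigit():
        return False
    number = int(code[1:4])
    return any(code[0] == letter and low <= number <= high
               for letter, low, high in OPCS_CHD_RANGES)


def first_chd_year(episodes):
    """Year of the earliest episode with a CHD diagnosis or procedure, else None."""
    years = [ep["admidate"].year for ep in episodes
             if any(is_chd_diagnosis(c) for c in ep.get("diag", []))
             or any(is_chd_procedure(c) for c in ep.get("opertn", []))]
    return min(years) if years else None


# Hypothetical episodes for one Pseudo-ID.
episodes = [
    {"admidate": date(2003, 5, 1), "diag": ["I639"], "opertn": []},
    {"admidate": date(1998, 2, 9), "diag": ["J449"], "opertn": ["K402"]},
    {"admidate": date(2001, 7, 3), "diag": ["I219"], "opertn": []},
]
```

Only the derived value (here, a year) would leave the Data Safe Haven under the arrangement described above; the episode rows themselves would not.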

The data is stored separately to participant identifiers. The two datasets will not be re-linked and the data will remain pseudonymised as described above. Month and Year of Death are stored in the dataset and used for statistical analyses but the dataset does not include full Date of Death. Participant identifiers are retained separately solely for study administration purposes.


Whitehall II (MR262) — DARS-NIC-346693-F2X1G

Type of data: information not disclosed for TRE projects

Opt outs honoured: Y, N, Yes - patient objections upheld, No - data flow is not identifiable, Anonymised - ICO Code Compliant, Identifiable, Yes, No (Section 251 NHS Act 2006)

Legal basis: Section 251 approval is in place for the flow of identifiable data, Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistical Registration Service Act 2007 , Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 - s261(5)(d); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2018-06-14 — 2021-06-13 2017.09 — 2024.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Mental Health and Learning Disabilities Data Set
  2. Bridge file: Hospital Episode Statistics to Diagnostic Imaging Dataset
  3. Diagnostic Imaging Dataset
  4. Hospital Episode Statistics Admitted Patient Care
  5. MRIS - Cause of Death Report
  6. MRIS - Cohort Event Notification Report
  7. MRIS - Scottish NHS / Registration
  8. Hospital Episode Statistics Accident and Emergency
  9. Mental Health Services Data Set
  10. Civil Registration - Deaths
  11. Demographics
  12. Cancer Registration Data
  13. Hospital Episode Statistics Outpatients
  14. Mental Health Minimum Data Set
  15. MRIS - Members and Postings Report
  16. Diagnostic Imaging Data Set (DID)
  17. Hospital Episode Statistics Accident and Emergency (HES A and E)
  18. Hospital Episode Statistics Admitted Patient Care (HES APC)
  19. Hospital Episode Statistics Outpatients (HES OP)
  20. Mental Health and Learning Disabilities Data Set (MHLDDS)
  21. Mental Health Minimum Data Set (MHMDS)
  22. Mental Health Services Data Set (MHSDS)
  23. Civil Registrations of Death
  24. COVID-19 SGSS First Positives (Second Generation Surveillance System)
  25. COVID-19 Vaccination Status
  26. Emergency Care Data Set (ECDS)
  27. HES-ID to MPS-ID HES Accident and Emergency
  28. HES-ID to MPS-ID HES Admitted Patient Care

Objectives:

The Whitehall II study was set up in 1985 as a prospective cohort project to explore the relationship between socio-economic status, stress and cardiovascular disease. The study, based at University College London (UCL), recruited civil servants working in London. The participants were sent a self-completion questionnaire covering a wide range of topics, and underwent a comprehensive clinical examination.

Since 1985 there have been eleven phases of data collection of a similar nature. These data have always been collected on the original cohort recruited in 1985, and no additional recruitment of participants has taken place since then. In addition to cardiovascular measures, the Whitehall II study has over the years added further measures to test physical functioning, cognitive functioning, mental health, measures of cortisol levels and new cardiovascular tests such as Heart Rate Variability (HRV) and Pulse Wave Velocity (PWV).

There are three distinct aspects to the study:
1) The compilation of research data, which consists of the collection of self-completion questionnaires and medical examination data from the Whitehall II cohort participants. Medical data and mortality data from this cohort are also obtained through data linkage with external data sources such as NHS Digital and ONS. The totality of these data are compiled into the Whitehall II research database for use as a research resource;

2) Public health research studies undertaken within the scope of the Whitehall II study, which aim to answer specific questions and are primarily funded by grants from the Medical Research Council and the British Heart Foundation. Further studies have been funded by the European Commission Horizon 2020 and the Economic and Social Research Council. No raw or record level NHS Digital data is shared with funders.

3) Making pseudonymised data available to the scientific community for use in UCL-approved research studies beyond the scope of the Whitehall II study. This will not include NHS Digital data. Any data supplied to third parties, whether as part of the EU-funded LIFEPATH project or for any other purpose, will comprise:
a. Self-reported data provided voluntarily by the participants; and/or
b. Variables derived from the ONS mortality data. Specifically ‘yes’ or ‘no’ indicators to indicate if the participant is deceased and, if so, if specific causes of death were applicable or not; and/or
c. Verified self-reported clinical events in the form of ‘yes’ or ‘no’ variables to indicate if the participant has had a specific clinical event such as a stroke, cancer or CHD episode.

Regarding the last item, HES, Mental Health and Cancer registration data are used solely for the purpose of verifying self-reported data and are not included in any datasets shared with third parties. As an example, if a participant self-reported a stroke, the applicant would cross-check the data with the HES data to verify the diagnosis. If verified, the research data that could potentially be made available to third parties would include a ‘yes’/‘no’ indicator confirming the self-reported stroke.

The data will be used for public health research purposes. Based on 30 years of follow-up, the aim is to examine the interrelationships between biological, psychosocial and behavioural factors in the ageing process, and identify key determinants of late life depression, cognitive decline, chronic disease, and physical functioning. The study’s healthy ageing cohort is an ideal platform for studying primary prevention of vascular disease (CHD, stroke) and diabetes. The cohort is now aged 62-84 years and is measured for age-related physical and cognitive functioning and mental health. The study contributes to the evidence on the potential for preventing functional decline through therapeutic risk factor reduction and behavioural interventions.

Self-reported clinical events data are open to major limitations of bias, including missing responses and attrition. Therefore, since 1997, UCL have supplemented the self-reported events with information extracted from GP and paper hospital notes, and also with data provided by NHS Digital.

The Whitehall II MRC grant (K013351, Adult Determinants of Late Life Depression, Cognitive Decline and Physical Functioning - The Whitehall II Study of Ageing) has dementia, disability and depression as the outcome variables. In order to be able to study these outcomes in older individuals it is crucial to have complete data from all possible sources. Data on psychiatric conditions are important outcomes in their own right, but they are also needed to study other conditions. For example, the diagnosis of dementia involves ruling out major psychiatric disorder as an underlying condition for the observed clinical phenotype. In order to do so, external data on psychiatric conditions are required. This cannot possibly be achieved without access to the Mental Health and Learning Disabilities Data Set (MHLDDS). In addition, information on clinical procedures, such as brain MRI or CT, is important to evaluate the validity of dementia diagnosis and changes in diagnostic testing over time, a potential source of bias that needs to be considered in longitudinal analyses. For this reason, records from the Diagnostic Imaging Dataset are also needed for the Whitehall II dementia project.

Yielded Benefits:

The key benefit to the public/patients is that linkage to English and Welsh records is helping expand the knowledge base to which the Whitehall II study has already contributed. UCL’s analyses will continue to generate evidence to improve public health policies, clinical guidelines, health care professionals, workplaces and promote healthier lifestyles in the general public for the benefit of patients and the healthcare system.

Evidence from previous benefits:
Whitehall II have contributed evidence to current clinical guidelines, such as the ‘European Guidelines on Cardiovascular Prevention in Clinical Practice’ (see Eur Heart J 2012;33:1635-1701 and Eur Heart J 2016;37:2315-2381) and the ‘Guidelines for the Prevention of Stroke in Patients With Stroke and Transient Ischemic Attack: A Guideline for Healthcare Professionals From the American Heart Association/American Stroke Association’ (see Stroke. 2014;45:2160-2236). UCL have used Whitehall II data in their state-of-the-art reviews on prediabetes (Tabak A, Kivimaki M. Lancet 2013; 379:2279-2290) and stress (Steptoe A, Kivimaki M. Nature Reviews Cardiology 2012;9(6):360-70 and Steptoe A, Kivimaki M. Annu Rev Public Health 2013;34:337-54.) to inform health professionals and policy makers in the UK and elsewhere.

The Whitehall II study has contributed evidence to the World Health Organization (WHO) policy documents for reducing social inequalities in health globally (Commission of Social Determinants in Health 2008) and the European and the UK reviews of inequalities and working conditions (Review of Social Determinants and the Health Divide in the WHO European Region, updated in 2014 and Fair Society, Healthy Lives, 2010) and developed a guide for evidence-based public health in a project led by the UK National Institute of Clinical Excellence (NICE, Killoran 2009). In addition, the study has provided evidence to European Union Occupational Safety and Health recommendations and contributed to priority settings in occupational health research at a European level (https://osha.europa.eu/en/tools-and-publications/publications/e-facts/efact18/view; https://osha.europa.eu/en/tools-and-publications/publications/reports/management-psychosocial-risks-esener; https://osha.europa.eu/en/tools-and-publications/publications/reports/summary-priorities-for-osh-research-in-eu-for-2013-20).

Furthermore, the Whitehall II study has contributed evidence to the American Heart Association prevention policy, which in turn influences UK policy (American Heart Association Behavior Change Committee of the Council on Epidemiology and Prevention, Council on Lifestyle and Cardiometabolic Health, Council for High Blood Pressure Research, and Council on Cardiovascular and Stroke Nursing. ‘Better population Health through behaviour change in adults: a call to action’ Circulation. 2013 Nov 5;128(19):2169-76).

The paper on long working hours and stroke (Lancet. 2015 Oct 31;386(10005):1739-46) was referenced by WHO (Preventing disease through healthy environments: a global assessment of the burden of disease from environmental risks. World Health Organization. http://www.who.int/iris/handle/10665/204585) and received widespread media coverage (rated 12th in the worldwide Altmetric Top 100). Two Whitehall papers contributed evidence to NICE guidelines on Dementia (NG16) published in October 2015. The papers were referenced in “Dementia, disability and frailty in later life – mid-life approaches to delay or prevent onset” (Sabia S, Singh‑Manoux A, Hagger‑Johnson G et al. (2012) Influence of individual and combined healthy behaviours on successful aging. Canadian Medical Association Journal doi: 10.1503/cmaj.121080; Singh‑Manoux A, Marmot MG, Glymour M et al. (2011) Does cognitive reserve shape cognitive decline? Annals of Neurology 70: 296–304).

More recently, a Whitehall II paper on alcohol consumption and cognitive decline, ‘Moderate alcohol consumption as risk factor for adverse brain outcomes and cognitive decline: longitudinal cohort study’ (BMJ 2017;357:j2353), received widespread media coverage, demonstrating the neurotoxicity of alcohol consumption (the paper was rated in the top 5% of all outputs scored by Altmetric).

Expected Benefits:

UCL’s analyses will continue to generate evidence to improve public health policies, clinical guidelines, health care professionals, workplaces and promote healthier lifestyles in the general public for the benefit of patients and the healthcare system.

Evidence from previous benefits:
Whitehall II have contributed evidence to current clinical guidelines, such as the “European Guidelines on Cardiovascular Prevention in Clinical Practice” (see Eur Heart J 2012;33:1635-1701) and the “Guidelines for the Prevention of Stroke in Patients With Stroke and Transient Ischemic Attack: A Guideline for Healthcare Professionals From the American Heart Association/American Stroke Association” (see Stroke. 2014;45:2160-2236). UCL have used Whitehall II data in their state-of-the-art reviews on prediabetes (Tabak A,…Kivimaki M. Lancet 2013; 379:2279–2290) and stress (Steptoe A, Kivimäki M. Nature Reviews Cardiology 2012;9(6):360-70 and Steptoe A, Kivimäki M. Annu Rev Public Health. 2013;34:337-54.) to inform health professionals and policy makers in the UK and elsewhere.

UCL have contributed evidence to the World Health Organization (WHO) policy documents for reducing social inequalities in health globally (Commission of Social Determinants in Health 2008) and the European and the UK reviews of inequalities and working conditions (Review of Social Determinants and the Health Divide in the WHO European Region, updated in 2014 and Fair Society, Healthy Lives, 2010) and developed a guide for evidence-based public health in a project led by the UK National Institute of Clinical Excellence (NICE, Killoran 2009). In addition, UCL have provided evidence to European Union Occupational Safety and Health recommendations and contributed to priority settings in occupational health research at a European level (https://osha.europa.eu/en/tools-and-publications/publications/e-facts/efact18/view; https://osha.europa.eu/en/tools-and-publications/publications/reports/management-psychosocial-risks-esener; https://osha.europa.eu/en/tools-and-publications/publications/reports/summary-priorities-for-osh-research-in-eu-for-2013-20).

In addition, UCL have contributed evidence to the American Heart Association prevention policy, which in turn influences UK policy (American Heart Association Behavior Change Committee of the Council on Epidemiology and Prevention, Council on Lifestyle and Cardiometabolic Health, Council for High Blood Pressure Research, and Council on Cardiovascular and Stroke Nursing. “Better population Health through behaviour change in adults: a call to action”. Circulation. 2013 Nov 5;128(19):2169-76).

Whitehall II findings on modifiable protective factors and risk factors have generated wide interest in media in the UK and worldwide. For example, the paper on alcohol consumption and cognitive ageing was ranked Altmetric Top 100 in the world in 2014 (http://www.altmetric.com/top100/2014/).

Outputs:

All published outputs will be aggregated, with small numbers suppressed in line with the HES analysis and mental health guides.
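
As an illustration only, small-number suppression on an aggregate table might be sketched as follows. The function name and the threshold of 5 are hypothetical placeholders; the actual rules are those set out in the guides cited above, which this sketch does not reproduce.

```python
from typing import Dict, Optional


def suppress_small_counts(
    counts: Dict[str, int], threshold: int = 5
) -> Dict[str, Optional[int]]:
    """Return a copy of an aggregate table with small cells masked.

    Any non-zero count below `threshold` is replaced by None. The
    threshold is an assumed placeholder, not a value taken from the
    HES analysis guide or the mental health guide.
    """
    return {
        category: (None if 0 < n < threshold else n)
        for category, n in counts.items()
    }


# Hypothetical aggregate table: event counts by category.
table = {"stroke": 3, "chd": 12, "cancer": 0}
print(suppress_small_counts(table))
```

This sketch handles primary suppression only. Where a masked cell could be recovered from row or column totals, further (secondary) suppression would also be needed.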

The Whitehall II researchers will use peer-review journals to report the contribution of midlife inflammatory, vascular, and metabolic factors to chronic disease, depression, cognitive impairment and functional health in later life. They will also assess whether the adoption of healthy lifestyle even at older ages modifies functional trajectories, and aim to develop multi-factorial predictive algorithms, like those developed for cardiovascular diseases, to facilitate early identification of adverse ageing outcomes.

The study dissemination plan, which has been very successful up to now (please see examples below), involves publications in high impact scientific journals, scientific meetings, briefing papers for policy makers, regular and ad hoc meetings with interested parties such as the UK Health Forum.

A research dataset will be created for the UCL study researchers named in the Data Sharing Agreement. It will contain all records with all directly identifiable data removed. It will include the study ID but no personal variables. Any sensitive variables that might identify a participant (such as hospital dates or full ICD-10 codes) will never be published, reported or provided to third parties.

The scientific conclusions of the Whitehall II study will be published in international peer-reviewed journals starting from a few months after the data are available. UCL aim to continue publishing the analyses in journals with high coverage and high impact factor and UCL’s preference is journals with an open access option (web version of the paper available free of charge). Some examples of journals where the Whitehall II researchers have published their results in 2015 are PLoS One, American Journal of Medicine, Stroke, European Heart Journal, Neurology, Nature, Lancet, Epidemiology, British Medical Journal, etc.

A full list of the project publications to date is published on the Whitehall II website (https://www.ucl.ac.uk/whitehallII/publications).

Processing:

Only substantive employees of UCL will have access to the data and only for the purposes described in this document.

UCL will process the ONS data in accordance with the standard ONS terms and conditions.

The Whitehall II study at UCL currently holds sensitive and identifiable data from several sources, all linked to the cohort. These are the Personal Demographics Service (PDS), Cancer Registrations, ONS Mortality, HES admitted patient care, HES outpatient, MHLDDS and HES Accident and Emergency.

UCL have already supplied NHS number, date of birth, and gender to NHS Digital for linkage.

Linking with electronic health records is at the core of the project, as they provide the objective health outcomes needed for our project. These data will be used by the researchers using a variety of statistical methods to fulfil the study aims described above.

All personal information about the study participants is treated in the strictest confidence in accordance with the Data Protection Act (1998) and the NHS Information Governance requirements. As described in the study NHS IG Toolkit, the study safeguards and security policies ensure appropriate use of all personal information collected. Personal data about study participants (e.g. name, NHS number, contact details, date of birth, GP details, etc) are stored securely on the UCL secure computer network managed by the UCL School of Life and Medical Sciences (SLMS). These data are handled by the Whitehall II administrative and data management personnel and are used only to contact participants.

Clinical information about participants provided by external sources such as NHS Digital is also stored separately on this secure UCL SLMS area.

Whitehall II researchers do not have direct access to the identifiable records in either paper or electronic form. No identifiable personal data will ever be published.

The clinical, questionnaire and medical data collected by the study (including HES, mental health, diagnostic imaging and ONS data) are used for research purposes only. These data are pseudonymised before being moved from the secure area to the research area on the UCL SLMS network. Pseudonymisation is achieved by assigning each participant a unique identifier and by removing all personal information (e.g. name, NHS number, contact details, date of birth, GP details, etc) before the data are added to the database used by the researchers.
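
The pseudonymisation step described above could be sketched roughly as follows. This is a minimal illustration, not the study's actual procedure: the field names, the `pseudonymise` function and the use of random hexadecimal study IDs are all assumptions.

```python
import secrets

# Assumed field names for the personal information listed above.
IDENTIFYING_FIELDS = {
    "name", "nhs_number", "contact_details", "date_of_birth", "gp_details",
}


def pseudonymise(records, key_field="nhs_number"):
    """Return (research_records, lookup).

    Each participant is given a random study ID and every identifying
    field is dropped from the research copy. `lookup` maps the original
    key to the study ID and would have to stay in the secure area.
    """
    lookup = {}
    research_records = []
    for record in records:
        key = record[key_field]
        if key not in lookup:
            # New participant: assign a fresh random identifier.
            lookup[key] = secrets.token_hex(8)
        clean = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
        clean["study_id"] = lookup[key]
        research_records.append(clean)
    return research_records, lookup
```

Because the study ID is random rather than derived from the NHS number, it cannot be reversed without the lookup table, which is why that table must not move to the research area.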

A data sharing policy is in place to make the pseudonymised research data available to the scientific community; this refers to the trial data and not data supplied by NHS Digital.

All collaborators must be bona-fide scientists with an established record, who will conduct high quality, ethical research. The research files provided to these external collaborators are tailored to their project and are securely transferred for their use only. Any data supplied to third parties will comprise:
• Self-reported data provided voluntarily by the participants; and/or
• Alive or dead status flag derived from the ONS mortality data. Specifically ‘yes’ or ‘no’ indicators to indicate if the participant is deceased and, if so, if specific causes of death were applicable or not; and/or
• Verified self-reported clinical events in the form of ‘yes’ or ‘no’ variables to indicate if the participant has had a specific clinical event such as a stroke, cancer or CHD episode.

Regarding the last item, HES, Mental Health, DID and Cancer registration data are used solely for the purpose of verifying self-reported medical events and are not included in any datasets shared with third parties. As an example, if a participant self-reported a stroke, the applicant would cross-check the data with the HES data to verify the diagnosis. If verified, the research data that could potentially be made available to third parties would include an indicator confirming the self-reported stroke.
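
The verification step in the stroke example can be sketched as below. The function name and the plain event labels are hypothetical; real matching would work on coded diagnoses, which this illustration does not attempt.

```python
from typing import Dict, Set


def verified_event_flags(
    self_reported: Set[str], linked_diagnoses: Set[str]
) -> Dict[str, str]:
    """Return a 'yes'/'no' flag for each self-reported event.

    An event is flagged 'yes' only if it also appears in the linked
    record. The output holds the flags alone: nothing from the linked
    record itself is copied into it.
    """
    return {
        event: ("yes" if event in linked_diagnoses else "no")
        for event in sorted(self_reported)
    }


# Hypothetical participant: reported a stroke and a cancer diagnosis,
# but only the stroke is found in the linked hospital record.
print(verified_event_flags({"stroke", "cancer"}, {"stroke"}))
```

Note that events present in the linked record but not self-reported never appear in the result, which matches the stated use of the linked data solely for verification.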

Funding arrangements, both UK and non-UK funding, will not include sharing NHS Digital record level data with these funders or permit them to influence the results or dissemination of results.


UK Early Life Cohort Feasibility Study (ELC-FS) — DARS-NIC-482185-K8G0F

Type of data: information not disclosed for TRE projects

Opt outs honoured: Identifiable, Yes (Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 - s261(5)(d), National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2023-05-08 — 2024-04-30 2023.05 — 2024.02. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Birth Notification Data
  2. Civil Registration - Births
  3. Demographics

Objectives:

University College London requires access to NHS England data for the purpose of the following research project: The Early Life Cohort Feasibility Study (ELC-FS).

The following is a summary of the aims of the research project provided by University College London:

“The ELC-FS will test proof of concept for a new national birth cohort study for the UK. It will collect rich data on babies born across the UK during two consecutive months of 2022 or 2023, capturing the economic and social environments into which these babies are born, and their health, well-being and development in their first 6-10 months. The study will provide data of substantive value in itself, providing vital evidence on new lives across the UK at a critical time, particularly with regards to the shock to health and the economy induced by the COVID-19 pandemic, as well as the as yet unknown impacts of Brexit on our economy and society. It will highlight major sources of early developmental inequalities and family stressors, and identify potential foci for early intervention and support.

“NHS England data will be used for the recruitment and sampling elements of the study, including sampling participants in England and Wales for the study from birth registrations linked to birth notifications and using an opt-out approach to taking part in the study following first contact with the sample.

“The primary scientific aim of the study will be to understand how variability, and particularly inequalities, in the central domains of early child development emerge over time and to determine the social and biological factors influencing their trajectories. The study will use the ethnicity fields to boost the England sample. Although for this study it is unlikely to be practicable to include boost samples based on the other fields, securing access to these variables is important to demonstrating the future feasibility of this, including for Wales (and Scotland and Northern Ireland under separate applications to relevant controllers). The study also proposes to use additional fields for targeted fieldwork approaches to maximise engagement and inclusivity. Access to the individual-level characteristics by the Centre for Longitudinal Studies (CLS) also provides possibilities for wider use of records for methodological research, including non-response analysis to assess population representivity, weighting and adjustment.

“NHS England will draw a sample of registrations of births linked to NHS birth notifications for the two selected birth months. The initial source of the sampling fields would be (i) the mother who has registered a birth (ii) informant, where father/other parent, is not at same address as mother. Before contacting the sample, NHS England would provide the study with up-to-date address details, embarkations and death notifications from its Personal Demographics Service. The study requires access to names and addresses including baby name and address (from birth notifications), mother name and address (from birth registrations), father/other parent name and address (from birth registrations).

“For assessing feasibility of using the birth registrations linked to NHS birth notifications as a sample frame, for over-sampling on ethnicity, for targeted fieldwork materials and also for methodological and research purposes the study also requires additional fields from the birth notification and birth registration records. These include fields such as age of mother, multiple birth, birthweight, ethnicity baby, ethnicity mother, gestational age from birth notification records; and fields such as age of mother, multiple birth, birth weight, socioeconomic status, occupation, registration type (sole/joint), name and address mother, name and address father (if registered), country of birth mother, country of birth father (if registered), previous births from birth registration records. Variables which are in both the birth notifications and birth registrations are requested for verification/validation purposes and to fill in missing data.

“The study team also require de-identified sampling fields for all patients who are sampled for the study for non-response analysis and adjustment. Non-response analysis, carried out on data from the whole sample, forms a vital part of the feasibility study since it is needed for the project to understand response rates among different population groups, what the biases are in who decides to take part, and whether or not the sample achieved is sufficiently representative of the national population. This analysis will be used for non-response adjustment (in particular to generate non-response weights, as well as other statistical methods). These weights and guidance on non-response adjustment will be provided to data users and are essential to the study in order that any substantive scientific analyses that are undertaken using the feasibility study data (and in due course, the main study) can be adjusted for any biases due to selection into the sample, and to ensure the study findings are robust and valid.”
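
The summary does not say how the non-response weights will be calculated. One common textbook approach is a weighting-class adjustment, in which each respondent is weighted by the inverse of the response rate in their group. The sketch below assumes that method, and the function and group names are hypothetical.

```python
from collections import Counter


def nonresponse_weights(sample):
    """Compute weighting-class non-response weights.

    `sample` is a list of (group, responded) pairs covering everyone who
    was sampled. The weight for a group is sampled / responded, i.e. the
    inverse of that group's response rate. Groups with no respondents
    are left out, since no weight can be defined for them.
    """
    sampled = Counter()
    responded = Counter()
    for group, did_respond in sample:
        sampled[group] += 1
        if did_respond:
            responded[group] += 1
    return {g: sampled[g] / responded[g] for g in sampled if responded[g] > 0}


# Hypothetical sample: group A has a 50% response rate, group B 100%.
sample = [("A", True), ("A", False), ("B", True), ("B", True)]
print(nonresponse_weights(sample))
```

This is why the sampling fields are needed for the whole sample and not only for those who take part: the denominator of each response rate counts non-respondents as well.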


Phase I: Data for Sampling

The following NHS England data will be accessed:
• Birth Notifications – necessary because this dataset will be used to identify all births in the selected two-month period and will contain the dates of birth and each baby’s ethnicity;
• Civil Registration Births – necessary because this dataset will contain the mother’s postcode which will be converted to Lower Super Output Area (LSOA)

The level of the data will be:
• Pseudonymised

The data will be minimised as follows:
• Limited to all babies born in England and Wales during two consecutive months in either 2022 or 2023;
• Limited to four non-identifying variables:
o Unique person ID
o Baby’s ethnicity
o Baby’s month of birth
o Lower Super Output Area of mother’s place of residence


Phase II: Data for Fieldwork and Recruitment

The following NHS England data will be accessed:
• Birth Notifications – necessary because this dataset provides universal coverage of the population of babies, contains key characteristics of the baby, mother and father, including where they live, and may allow own-household fathers (OHFs, defined as fathers resident at a different address to the baby at the time of the interview) to be recruited in their own right;
• Civil Registration Births – necessary because this dataset contains additional variables which could be used for sampling, including the ethnic group of the baby and because it facilitates timely access to updated addresses for any post-birth moves and provides notifications of early infant deaths, as well as providing possibilities for wider use of health records for substantive and methodological research.
• Demographics – necessary to facilitate timely access to address updates for any post-birth moves and to provide notifications of deaths.

The level of the data will be:
• Identifiable – necessary in order to be able to make contact with the selected sample to let them know they have been chosen to take part in a study and to give them the option to opt out.

The data will be minimised as follows:
• Limited to approximately 2,970 ‘families’ selected for recruitment by Ipsos and meeting the inclusion criteria (~2,376 ‘families’ in England and ~594 in Wales) – ‘families’ in this context is defined as the baby, their mother and their father or other parent;
• NHS England will screen for deaths and/or possible adoption cases and will apply National Data Opt-Outs and ‘families’ will be excluded from the data disseminated if:
o either the baby or mother are identified as deceased;
o either the baby or mother are identified as having de-registered from the NHS for the reason of moving abroad ('embarkation');
o the baby’s record is marked as sensitive or is not traced;
o any of the baby, mother or father has registered a National Data Opt-Out
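
The exclusion rules listed above can be expressed as a single check per ‘family’. This is a sketch only: the `Person` and `Family` classes and their field names are assumptions made for illustration, not the structure of the data NHS England holds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    deceased: bool = False
    embarked: bool = False   # de-registered from the NHS on moving abroad
    opted_out: bool = False  # has registered a National Data Opt-Out


@dataclass
class Family:
    baby: Person
    mother: Person
    father: Optional[Person] = None  # father or other parent, if registered
    baby_sensitive: bool = False     # baby's record is marked as sensitive
    baby_traced: bool = True         # baby's record was successfully traced


def is_excluded(family: Family) -> bool:
    """Apply the four listed exclusion rules to one family."""
    members = [
        p for p in (family.baby, family.mother, family.father) if p is not None
    ]
    return (
        family.baby.deceased or family.mother.deceased
        or family.baby.embarked or family.mother.embarked
        or family.baby_sensitive or not family.baby_traced
        or any(p.opted_out for p in members)
    )
```

As written in the list, the death and embarkation rules apply to the baby and mother only, while the opt-out rule covers the baby, mother and father.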

Sample sizes have been calculated taking into account estimated recruitment rates and to ensure a representative sample. The fields being requested from the data sources are those that are necessary to draw the sample and contact people, to be able to assess the feasibility of using them in the sampling frame, to oversample on particular characteristics such as ethnicity, for targeted engagement (e.g. including leaflets aimed at teen mums), responsive design (e.g. during fieldwork checking to see if certain groups are not taking part and putting more effort into recruiting from those groups) and for important methodological purposes, including non-response analysis to assess population representivity, weighting and adjustment.

University College London is the research sponsor and the controller as the organisation responsible for ensuring that the data will only be processed for the purpose described above.

The lawful basis for processing personal data under the UK GDPR is:
Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller;

The lawful basis for processing special category data under the UK GDPR is:
Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

This processing is in the public interest because it adheres to the UK Policy Framework for Health and Social Care Research, which protects and promotes the interests of patients, service users and the public, and aims to produce generalisable and publicly available information to inform future decisions over patients’ treatments or care.

The funding is provided by the Economic and Social Research Council (ESRC). The funding is specifically for the feasibility study described. Funding is in place until June 2024.

Ipsos is a processor acting under the instructions of University College London. Ipsos is a fieldwork agency which will undertake the sampling and recruitment tasks.

Amazon Web Services (AWS) is a processor acting under the instructions of UCL. UCL stores data on the Cloud provided by Amazon Web Services.

The project is led by three Co-Directors employed by UCL. These Co-Directors are supported by a team of Co-Investigators, including experts in each of the four nations of the UK, including from UCL, Swansea University, University of Edinburgh, University of Ulster, and the Fatherhood Institute.

A number of highly experienced senior collaborators are supporting the project by providing input into the study design from Bryson Purdon Social Research, Public Health Scotland, University of Edinburgh, University of Belfast, Manchester Metropolitan University, ScotCen. These senior collaborators are not Controllers or Processors for the study. These contributions are all funded by an ESRC grant.

The study also has project partners who are providing non-financial support to the project and are not directly funded. The project partners, who also bring extensive networks and experience to the project, are the Nuffield Family Justice Observatory, First 1001 Days Movement, and the National Children’s Bureau (NCB).

Public and Patient Involvement and Engagement helped refine the purpose of the research.

During September-November 2021, two waves of public dialogue workshops for the ELC-FS were hosted by Kantar Public, with the approach carefully modified in each UK nation to reflect the different legal frameworks in place across the UK. These involved 62 participants, all parents with young children across five locations (two in England and one each in Wales, Scotland and Northern Ireland). Kantar explored the attitudes of parents of young children to the proposed uses of administrative data in the ELC-FS through the public dialogue workshops. This included exploring views about the use of linked birth registration data and NHS maternity records as the sample frame for this project, and the proposed recruitment process. Overall, parents were accepting and supportive of the proposed uses of identifiable administrative data and recruitment processes proposed, as they saw them as necessary to draw the sample and to ensure sufficient representativeness of the study. The public engagement suggested that parents of young children understood the rationale and public benefit for this proposed use of their data and its importance to building an inclusive cohort.

UCL has used existing parent and young person advisory groups, recruited from across England and Northern Ireland, run by NCB as representatives of potential study participants. The Young Research Advisors (YRAs) are a diverse group of approximately 40 children and young people aged from 7-18. The Family Research Advisory Group (FRAG) comprises approximately 25 parents and carers, some of whom are parents of young people with additional support needs. Workshops have been conducted with each of these groups focusing on co-production of the scientific content, ethics and participant engagement. In addition to work with the NCB, UCL commissioned Ipsos to conduct qualitative research with own household fathers (OHFs) and low-income families to understand how best to engage these groups in the ELC-FS. In both the NCB and qualitative research projects, the participants had useful suggestions about how to build trust with study participants through our communication strategies and interviewer training.

Outputs:

The expected outputs of the processing will be:
• A research resource deposited with the UK Data Archive in late 2024, available to bona fide researchers for the purposes of statistical data analysis and comprising information collected from the consented participants. The information provided by NHS England will have facilitated the collection of this information. The deposited data will not include the sampling field data provided by NHS England.
• A number of recommendations for funders and stakeholders around the design for a future main early life cohort study. These will include a set of high-quality, open-access outputs (including reports, working papers and journal articles) to enable a thorough assessment of the feasibility of the main ELC and to inform its design and implementation. Outputs from the feasibility study will include an evaluation of recruitment rates and biases, and of data quality (including item non-response), the suitability and scalability of data collection innovations that have been tested, and an evaluation of experimental components (targeted incentives and bio-samples).
• An assessment of the quality of data fields on the achieved sample frames, and their suitability for use for over-sampling or targeted recruitment strategies, a report on record linkages (lessons learned), recommended next steps in developing a national study of ‘children in need’, reports from the public dialogue, parent and young person groups, including qualitative work with fathers.
• A set of design protocols for the main study including for the sample design, participant contact, and scientific content (including data collection instruments, bio-samples and record linkages) and a public engagement policy.

The outputs will not contain NHS England data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived.

The outputs will be communicated to relevant recipients through the following dissemination channels:
• Reports to funders and stakeholders;
• The CLS website https://cls.ucl.ac.uk/cls-studies/
• A participant facing ELC-FS website with its own branding (to be commissioned).

The target date for production and dissemination of the above outputs is 2023/2024.

Processing:

Phase I: Data for Sampling

NHS England will provide the relevant records from the Civil Registration Births and Birth Notifications datasets to Ipsos. The data will contain no direct identifying data items but will contain a unique person ID which can be used by NHS England to link the data with other record level data it holds.

The data will be stored on servers at Ipsos. Access is restricted to employees or agents of Ipsos.

Analysts from Ipsos will process the data for the purpose of selecting a random sample of families for recruitment to the study.

Ipsos will transfer data to NHS England. The data will consist of the unique person ID (as originally supplied by NHSE) for the ~2,970 babies selected for recruitment.

NHSE will extract the relevant data from the Civil Registration Births and Birth Notifications datasets for each baby, their mother and, where known, their father. NHSE will match the details of each ‘family’ member with the latest Demographics data to identify any deaths, embarkations or no traces, and will apply National Data Opt-Outs to the output. NHSE will then remove any ‘family’ where any member has applied the National Data Opt-Out, where either the baby or mother is deceased, or where the baby is not traced.
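
The family-level exclusion rule in the preceding paragraph can be sketched as follows. This is an illustration only, not NHS England's implementation: the record layout (family_id, role, opted_out, deceased, traced) and the function name are hypothetical.

```python
from collections import defaultdict


def eligible_families(members):
    """Return the set of family IDs that survive the exclusion rules.

    `members` is an iterable of dicts with hypothetical keys:
    family_id, role ('baby' | 'mother' | 'father'),
    opted_out, deceased, traced (all booleans).
    """
    families = defaultdict(list)
    for m in members:
        families[m["family_id"]].append(m)

    keep = set()
    for fid, people in families.items():
        # An opt-out by any member removes the whole family.
        if any(p["opted_out"] for p in people):
            continue
        # A deceased baby or mother removes the family.
        if any(p["deceased"] for p in people if p["role"] in ("baby", "mother")):
            continue
        # An untraced baby removes the family.
        if any(not p["traced"] for p in people if p["role"] == "baby"):
            continue
        keep.add(fid)
    return keep
```

Note that, as the rule is worded, a deceased father does not by itself exclude the family; only an opt-out by the father does.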


Phase II: Data for Fieldwork and Recruitment

NHS England will provide the relevant records from the Civil Registration Births, Birth Notifications and Demographics datasets to Ipsos. The data will contain directly identifying data items including Names, NHS Number, Date of Birth, Address and Postcode which are required to facilitate contact with the parents and also to enable Ipsos to screen for deaths and obtain latest address information from NHSE ahead of each contact attempt.

The data will be stored on servers at Ipsos.

Ipsos fieldworkers will use the name and contact details to send an initial letter to the relevant mothers and fathers giving them information about the study and giving an opportunity for respondents to remove themselves from the sample. Additionally, analysts from Ipsos will process the data for the purpose of real time analyses of response rates.

Prior to mailing out further correspondence to potential recruits, Ipsos will upload the details of the cohort, excluding those who opt-out, to NHS England’s Cohort Management System (CMS) and will download reports containing latest vital status and addresses.

Ipsos will extract a pseudonymised subset of the data containing variables necessary for non-response analysis and securely transfer this to UCL. The data will be stored in UCL’s Data Safe Haven.

UCL uses offsite back-up services provided by VIRTUS Data Centres.

UCL stores data on the Cloud provided by Amazon Web Services.

The data will be accessed by authorised personnel via remote access. The data will remain on the servers at UCL at all times. There will be no requirement and no attempt to reidentify individuals when using this data.

Once the fieldwork has concluded, Ipsos will transfer the details, including direct identifiers, of all people who consented to participate to UCL along with the deidentified data for non-respondents and those who declined to participate or opted out of taking part.

Ipsos will retain the data until instructed by UCL to destroy it or contractually required to destroy it. UCL will undertake verification checks on receipt of the data from Ipsos and will instruct deletion when ready. Ipsos are contractually required to delete the data within 28 calendar days from the end of their contract with UCL.

The data will not leave England and Wales at any time.

All personnel accessing the data have been appropriately trained in data protection and confidentiality.


Virus Watch: Understanding community incidence, symptom profiles, and transmission of COVID-19 in relation to population movement and behaviour — DARS-NIC-372269-N8D7Z

Type of data: information not disclosed for TRE projects

Opt outs honoured: No - consent provided by participants of research study, Anonymised - ICO Code Compliant, No, Identifiable (Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 – s261(2)(c), Health and Social Care Act 2012 – s261(2)(c)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2020-09-14 — 2021-09-13 2020.10 — 2024.01. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Covid-19 UK Non-hospital Antigen Testing Results (pillar 2)
  2. COVID-19 Second Generation Surveillance System (Beta version)
  3. COVID-19 Second Generation Surveillance System
  4. Civil Registration - Deaths
  5. Emergency Care Data Set (ECDS)
  6. Hospital Episode Statistics Admitted Patient Care
  7. Hospital Episode Statistics Critical Care
  8. Hospital Episode Statistics Outpatients
  9. COVID-19 Vaccination Adverse Reactions
  10. COVID-19 Vaccination Status
  11. HES-ID to MPS-ID HES Outpatients
  12. Civil Registrations of Death
  13. COVID-19 Second Generation Surveillance System (SGSS)
  14. COVID-19 UK Non-hospital Antigen Testing Results (Pillar 2)
  15. Hospital Episode Statistics Admitted Patient Care (HES APC)
  16. Hospital Episode Statistics Critical Care (HES Critical Care)
  17. Hospital Episode Statistics Outpatients (HES OP)
  18. COVID-19 SGSS First Positives (Second Generation Surveillance System)

Objectives:

The Coronavirus (COVID-19) pandemic has caused large numbers of deaths and impacted lives around the world with the closure of schools, workplaces, and limitations on our freedom of movement. Most current knowledge of COVID-19 comes from observations at the more severe end of the disease in hospitalised patients. There is currently a lack of understanding of COVID-19 community incidence, symptom profile, severity, infectious period, risk factors, strength and duration of immunity, genetic differences in immune response, asymptomatic infection and viral shedding, household and community transmission risk and population behaviours during periods of wellness and illness (including social contact and movement and respiratory hygiene). This information can only be gathered accurately through large scale community studies. Virus Watch is one of the largest of such studies anywhere in the world and will help to inform NHS planning and the national public health response.

Virus Watch is a household community cohort study. Approximately 42,500 participants will be recruited via a postal invitation or using social media platforms, and asked to fill out a baseline questionnaire, followed by weekly and monthly update questionnaires, all online. Information will be gathered on all members of participating households. There is concern about an increased risk of COVID19 infection and death among people who are from a black or minority ethnic or migrant group. Persons from black and minority ethnic (BAME), and some migrant groups will be oversampled.

The approximate cohort size of 42,500 will consist of a targeted recruitment of 12,500 individuals from BAME groups and 30,000 from the general population. Persons from Poland will also be oversampled. This is because Poland is the most common European country of birth for people born abroad and resident in the UK, and the most common nationality in the UK after British according to the ONS. Polish is also the second most common language spoken in England according to the 2011 Census. The Polish population resident in Britain is therefore a sizable and important minority population whose risk of COVID-19 infection is of interest to the researchers.

A subset of 10,000 participants will be recruited for swab and blood sampling to estimate the incidence of COVID-19 infections and development of antibody responses. Participants can also choose to submit geotracking data via their mobile phone.

The data has been requested by University College London (UCL), which is acting as both the sole data controller and the sole data processor.

The primary purpose of linking the Virus Watch questionnaire data to hospital and mortality data held by NHS Digital is to estimate population-based COVID-19 related hospital visits (accident and emergency attendances and admissions) and deaths, to address objective h). A secondary purpose of linkage to HES data is to examine how social distancing measures have affected routine use of health services (eg planned procedures and outpatient appointments), to address objective g).

The primary purpose of linking VirusWatch to PHE Second Generation Surveillance System (SGSS) and National Pathology Exchange (NPEX) data is to identify any laboratory confirmed infections in the cohort (including individuals who are not part of the 10,000 participant-swabbing cohort), addressing objectives a) and i)-k).

The VirusWatch study has multiple objectives:

a) To measure the frequency of respiratory infection syndromes and related behaviours across the population of England & Wales.
b) To compare the impact in different sociodemographic, occupational and ethnic groups
c) To understand reasons underlying differential mortality impact in different ethnic groups
d) To assess the impact of the pandemic control measures on different population groups
e) To monitor population movement and assess the extent to which public contact increases the risk of infection, and social distancing measures decrease the risk.
f) To assess uptake, compliance with and effectiveness of and impact of recommended COVID-19 control measures
g) To assess the impact of social distancing on routine use of health services
h) To measure the impact of infections on hospitalisations and deaths.
i) To measure the incidence of PCR confirmable COVID 19
j) To measure COVID 19 clinical profiles (including the range of symptoms of COVID19 disease and the proportion of infections that are asymptomatic)
k) To measure the proportion of the population infected after each wave of the pandemic
l) To measure the protective effect of antibodies acquired through natural infection to seasonal and pandemic coronavirus.
m) To assess the accuracy of finger prick blood tests for antibodies to COVID-19 for potential use in COVID-19 control and vaccine effectiveness studies.
n) To measure the extent of pre-symptomatic and asymptomatic viral shedding in household contacts.
o) To ensure availability of specimens to measure the protective effect of T and B cell responses and to assess the value of proteomic analysis in assessing vulnerability to severe infection.

The legal basis for processing personal data for this purpose at UCL falls under Article 6(1)(e) of the General Data Protection Regulation (GDPR), i.e. “a task carried out in the public interest”. It also falls under Article 9(2)(j), “processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes”. The processing of data for this study is a task in the public interest because a better understanding of the risk posed by COVID-19 (such as likelihood of infection, likely symptoms, those at greatest risk of complications and the effect of risk factors such as social contact) will help guide proportionate public responses and reinforce public health messaging.

Due to the nature of this study and the urgent national call to set it up as soon as possible, UCL did not involve participants in its design. UCL have previously conducted Patient and Public Involvement to support similar community cohort studies of acute infections using similar methodologies. UCL have engaged the Young Persons' Advisory Group for research at Great Ormond Street Hospital to provide feedback on the Children's Participant Information Sheets. UCL will provide opportunities for survey participants to comment on survey methodology at the first monthly survey and consider revisions based on this. UCL will produce regular newsletters for survey participants.

Expected Benefits:

Virus Watch will provide data relevant to a wide range of audiences involved in pandemic response. Summary data at national and regional level will be presented on open access dashboards so that it is available to all these audiences in a timely way. Audiences include-

1) THOSE PLANNING AND UNDERTAKING PUBLIC HEALTH MEASURES TO MINIMISE TRANSMISSION –Cutting-edge methods to measure contact with others (including time spent at home and work, in social venues, transport modes, conversational contact, household contact) and hand/respiratory hygiene – and determine how these change over time, affect risk of infection and are affected by illness. Development of technologies and pathways for remotely supporting self-testing and self-isolation.
2) THOSE RESPONSIBLE FOR PLANNING THE NHS RESPONSE - Understanding of the number of people affected over time, the range of severity, health care seeking behaviour and the case hospitalisation and mortality ratios will allow better predictions of surges in NHS activity supporting measures such as triaging, cohorting, care outside hospital, cancelling routine activities etc.
3) THOSE PROVIDING FRONT LINE CARE –Improved case definitions by age, sex and ethnicity guiding targeting of diagnostics and contact tracing.
4) ACADEMIC GROUPS INVOLVED IN UNDERSTANDING THE PANDEMIC. Extensive data shared according to the principles of the Joint statement on sharing COVID-19 data.
5) THE GENERAL PUBLIC AND THE MEDIA. Better understanding of the risk posed by COVID-19 such as likelihood of infection, likely symptoms, those at greatest risk of complications and the effect of risk factors such as social contact will help guide proportionate public responses and reinforce public health messaging.

Outputs:

UCL plans to disseminate the outputs through a number of channels:

1) Journal publications (open), including The Lancet, the British Medical Journal (target dates: March 2021, June 2021)
2) Presentations to the Scientific Advisory Group for Emergencies (SAGE), (target dates: October & December 2020, February, April, June 2021); presentation to Department of Health and Social Care (target date: October 2020)
3) Presentations at scientific conferences, including European Respiratory Society Annual Congress, European Society for Paediatric Infectious Diseases Scientific meeting, Public Health England Annual conference (target dates: April 2021, September 2021)
4) Regular updates published on the VirusWatch website (http://ucl-virus-watch.net/) and results dashboard, aimed at the general public. These will be published monthly throughout the study.

All outputs will be in aggregate form only with small numbers suppressed in line with the HES analysis guide.
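
Small-number suppression of the kind referred to above can be sketched as follows. This is a minimal illustration, not the procedure from the HES analysis guide: the threshold value, the suppression marker and the treatment of zero counts are assumptions, and the applicable rules should be taken from the guide itself.

```python
def suppress_small_counts(table, threshold=8, marker="*"):
    """Replace non-zero counts below `threshold` with `marker`.

    `threshold` and `marker` are hypothetical defaults chosen for
    illustration; zero counts are left visible here by assumption.
    """
    return {
        group: (marker if 0 < count < threshold else count)
        for group, count in table.items()
    }
```

For example, a table of admissions by region would have any cell with a count between 1 and 7 replaced by the marker before publication, under the assumed threshold.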

Processing:

UCL will use the Royal Mail Post Office Address File and systematically sample addresses in order to ensure representative samples are taken within each subgroup of interest. There is increasing evidence that people from BAME groups are at greater risk of hospitalisation from COVID-19 and their outcomes are worse. UCL will therefore oversample participants from BAME groups. UCL will also seek to recruit Polish groups.
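
Systematic sampling within subgroups, as described above, can be sketched as follows. This is a generic illustration of the technique (a fixed interval from a random start, applied separately per stratum), not UCL's actual sampling procedure; the function names, the stratum labels and the target sizes are hypothetical.

```python
import random


def systematic_sample(addresses, n, rng=None):
    """Pick up to n addresses at a fixed interval from a random start."""
    rng = rng or random.Random()
    if n <= 0 or not addresses:
        return []
    n = min(n, len(addresses))
    step = len(addresses) / n          # sampling interval (>= 1)
    start = rng.random() * step        # random start within the first interval
    last = len(addresses) - 1
    return [addresses[min(int(start + i * step), last)] for i in range(n)]


def sample_strata(strata, targets, rng=None):
    """Apply systematic sampling separately to each stratum.

    `strata` maps a stratum label to its ordered address list and
    `targets` maps the same labels to the number of addresses wanted.
    Oversampling a group means giving it a larger target than its
    population share alone would imply.
    """
    rng = rng or random.Random()
    return {
        label: systematic_sample(addresses, targets.get(label, 0), rng)
        for label, addresses in strata.items()
    }
```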

UCL will use a commercial company to efficiently send recruitment postcards inviting households to sign up online. UCL anticipate a 25% response rate from the general population yielding approximately 30,000 participants and 15% from the BAME population yielding approx. 12,500 participants (42,500 participants in total). The sample size is required in order to have sufficient statistical power to estimate less common events, including hospital admissions, in different subgroups, including people from Black and Minority Ethnic backgrounds. Depending on response rate and demographics of those registering, UCL will undertake a second recruitment that allows them to target any underrepresented groups. Invitation postcards will be in English and include a sentence in six languages, pointing participants to the study website containing Patient Information Sheets translated into their language.

All information sheets and consent forms have been translated into 6 languages (Urdu, Bengali, Punjabi, Portuguese, French, Polish). It is not feasible to obtain verbal consent on the phone, given the number of people participating, and in any case written consent is preferred. An online system for consent and participation is the only feasible option for this size of study.

UCL will assess recruitment rates and the representativeness of our sample following the first mail out of 50,000 postcards. If recruitment is lower than expected and/or under-representative of the national population, the researchers will create a digital recruitment campaign that will use social media adverts on the following platforms: Facebook, Google, Twitter, Instagram, LinkedIn. The social media adverts will have tailored messages that aim to improve our recruitment of BAME communities in addition to achieving an age, sex and geographically representative sample of the UK population. Social media users will receive our recruitment adverts and be directed to our website http://ucl-virus-watch.net/.

In order for a household to be enrolled, it must have an internet connection and email address, and all household members must agree to take part. Households will nominate a lead householder with whom the study will communicate and to whom it will email weekly surveys. The lead householder needs to be proficient in English, both to answer the weekly and monthly surveys, which will be in English only, and to support other household members in survey completion.

The concept of a ‘lead householder’ has been successfully used in previous studies run by UCL, including Fluwatch and Bugwatch (see: https://doi.org/10.1093/ije/dyv370; https://doi.org/10.1136/bmjopen-2018-028676).

UCL will supply NHS Digital with study participants’ Study ID, NHS numbers, names, addresses (including postcodes) and dates of birth for linkage. The cohort would only be submitted to NHS Digital once. The NHS Digital Data linkage team will hold the VirusWatch identifiers throughout the period of the DSA and link the VirusWatch identifiers to HES, Civil Registration Deaths and SGSS datasets on a quarterly basis. Only de-identified data (Study ID and attribute variables from the datasets held by NHS Digital) will be returned to UCL. The linked data supplied by NHS Digital will be uploaded to the DSH by UCL as soon as received. A file transfer mechanism enables information to be transferred into the Safe Haven simply and securely.
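
The return of de-identified data described above (Study ID plus attribute variables only) can be sketched as follows. This is an illustration under assumptions, not NHS Digital's linkage process: the field names and the function name are hypothetical.

```python
# Hypothetical names for the identifiers supplied for linkage.
IDENTIFIER_FIELDS = {"nhs_number", "name", "address", "postcode", "date_of_birth"}


def deidentify(linked_records):
    """Drop direct identifiers, keeping Study ID and attribute variables."""
    return [
        {field: value for field, value in record.items()
         if field not in IDENTIFIER_FIELDS}
        for record in linked_records
    ]
```

The Study ID is retained so that the returned attributes can be joined to the survey data held separately in the Data Safe Haven.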

UCL are requesting the following data to be linked to the VirusWatch cohort:
• Hospital Episode Statistics (HES) Admitted Patient Care
• HES Critical Care data
• HES Emergency Care Dataset
• HES Outpatient Dataset
• Civil Registration Deaths data - to ensure the researchers capture deaths for all participants, not just those who have a HES record.
• Public Health England (PHE) Second Generation Surveillance System (SGSS) data on confirmed cases of SARS-CoV-2 and other respiratory infections (influenza, respiratory syncytial virus, seasonal coronavirus, adenovirus, rhinovirus, parainfluenza virus, human metapneumovirus).
• National Pathology Exchange (NPEX) data (the ‘Pillar 2’ testing programme) on results of COVID-19 PCR tests carried out by commercial partners.

The Virus Watch survey data collection will take place between June 2020 and April 2021. UCL are requesting linked HES and SGSS data for the period January 2020 (the first UK case of COVID-19 was detected in late January) until five years after the end of follow up (2026) as COVID-19 may continue to circulate in the population, and to allow UCL researchers to examine long-term health impacts of COVID-19 infection. UCL will request Civil Registration Deaths data from the start of follow-up up to 5 years after the end of the study.

UCL request that linkage to HES, SGSS and Death data is refreshed quarterly during the period of questionnaire data collection (June 2020 to April 2021). UCL may request less frequent updates after April 2021 (depending on COVID-19 circulation). This will be subject to an amended Data Sharing Agreement with NHS Digital.

All VirusWatch data, including the data requested from NHS Digital will be stored in the UCL Data Safe Haven (DSH). Patient identifying variables (including names and addresses), which are requested from survey participants, will be kept in a different file in the DSH. Identifiers will be kept separate to survey responses, laboratory test results, linked NHS Digital data and geotracking data. The UCL DSH uses Dual Factor Authentication to access and handle data transferred into the DSH service. This ensures that only the named applicants will have access to the data from the DSH.

Only researchers (UCL substantive employees or PhD students), working under appropriate supervision on behalf of the data controller/processor within this agreement will have access to the data and only for the purposes described in this agreement. These individuals are experienced in handling individual-level, sensitive data and complete annual courses in information governance and data protection (a requirement for accessing the UCL DSH).

The data supplied by NHS Digital will not be shared with third parties or linked to any other datasets. UCL has no requirement to re-identify the supplied data and will not attempt to do so.


British Regional Heart Study (BRHS)- data linkage of established cohort to NHS Digital datsets (HES, MHMDS, DIDS) — DARS-NIC-28591-H5Q3X

Type of data: information not disclosed for TRE projects

Opt outs honoured: No - data flow is not identifiable, Anonymised - ICO Code Compliant, No (Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 – s261(2)(c), Health and Social Care Act 2012 – s261(2)(c)

Purposes: No (Academic)

Sensitive: Sensitive, and Non Sensitive, and Non-Sensitive

When:DSA runs 2018-07-01 — 2021-06-30 2019.01 — 2024.01. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL), NEWCASTLE UNIVERSITY, UNIVERSITY COLLEGE LONDON (UCL), UNIVERSITY COLLEGE LONDON (UCL), UNIVERSITY OF NEWCASTLE UPON TYNE

Sublicensing allowed: No

Datasets:

  1. Mental Health Services Data Set
  2. Mental Health Minimum Data Set
  3. Bridge file: Hospital Episode Statistics to Mental Health Minimum Data Set
  4. Hospital Episode Statistics Outpatients
  5. Hospital Episode Statistics Accident and Emergency
  6. Hospital Episode Statistics Critical Care
  7. Hospital Episode Statistics Admitted Patient Care
  8. Diagnostic Imaging Dataset
  9. Bridge file: Hospital Episode Statistics to Diagnostic Imaging Dataset
  10. Emergency Care Data Set (ECDS)
  11. Diagnostic Imaging Data Set (DID)
  12. Hospital Episode Statistics Accident and Emergency (HES A and E)
  13. Hospital Episode Statistics Admitted Patient Care (HES APC)
  14. Hospital Episode Statistics Critical Care (HES Critical Care)
  15. Hospital Episode Statistics Outpatients (HES OP)
  16. Mental Health Minimum Data Set (MHMDS)
  17. Mental Health Services Data Set (MHSDS)

Objectives:

The British Regional Heart Study (BRHS) is an established, long-term cohort study of cardiovascular disease and other common chronic diseases - the study comprises older men, currently aged 75-94 years, who originally joined the study in 1978. The BRHS currently obtain cancer registration data, mortality data and use the tracing service under NIC-148411-Q64H8 (MR104). The data held under NIC-148411-Q64H8 will not be linked to the data disseminated under this agreement. Morbidity rates in this study population are exceptionally high, with many study participants developing cardiovascular disease (CVD) and other physical illnesses; fractures and dementia are also major health problems. In order to obtain an accurate assessment of chronic disease outcomes, the researchers are seeking to supplement information on the cohort with disease information from hospital consultations and admissions (HES Data, MHMDS data and DIDs data).

The additional data provided by NHS Digital will be used to inform and develop a larger programme of research on the prevention of CVD (CHD and stroke), heart failure and CVD related ageing conditions including dementia, frailty and physical disability. For example, there is growing evidence that dementia and CVD share common risk factors. The BRHS data resource now includes a wide range of novel risk factors measured at both 60-79 years and at 72-91 years (with blood stored for the measurement of further markers). The data from NHS Digital will enhance the study and will lay the ground for investigation into the aetiology, mechanisms and prevention of these age-related conditions in older men and allow us to test new hypotheses in cardiovascular ageing.

Linking the NHS Digital data with the BRHS cohort database will strengthen/enhance the data on chronic disease diagnoses and on health service use. The researchers will use these NHS Digital data, using Study ID, along with data already available in the cohort study on social, biological, behavioural and environmental determinants of health - this will allow the researchers to undertake detailed research on the determinants of cardiovascular disease and other chronic diseases in later life.

The overarching objectives/purpose of this data request is to enhance the BRHS cohort study by obtaining more robust detailed data on disease outcomes in order to research ways to prevent CVD, heart failure, dementia and disability in older ages. The researchers will link the NHS Digital data to pseudonymised data in the BRHS cohort study, which has been obtained (from the cohort) over the last 40 years - this includes mortality, cancer, postal questionnaire data completed by the participants, questionnaire data collected from General Practice and data collected during the physical assessments in 1978-1980, 1998-2000 and 2010-2012.

The following scientific/research objectives will be investigated in the BRHS data based on the detailed disease outcomes data from NHS Digital -
1. Prediction of CVD risk in older people - To investigate the use of non-invasive arterial markers and novel blood markers reflecting a range of biological pathways in improving CVD risk prediction in older men.
2. Lifestyle determinants of CVD in older age - To assess patterns of key health behaviour (physical activity, obesity, diet) in influencing CVD morbidity and mortality in both men with and without established CVD.
3. Modifiable risk factors and dementia - To investigate lifestyle factors measured in mid-life and older age (obesity, smoking, physical activity) as well as diet quality and nutritional markers in older age and risk of developing dementia.
4. Socioeconomic determinants of cardiovascular aging - To investigate the impact of socioeconomic factors that are important in preventing CVD and dementia in older people.
5. Dementia and CVD – To investigate shared risk factors and mechanistic pathways underlying CVD and dementia and improving early identification of CVD and dementia. This research will help develop strategies to prevent dementia and CVD.
6. Determinants of Heart failure – (i) To investigate pathways to prevention of heart failure distinguishing between reduced ejection heart failure and preserved ejection heart failure which is more common in older adults; (ii) to develop prediction risk scores for use in clinical practice to identify older adults at high risk of developing heart failure.
7. Later life determinants of stroke – To differentiate subtypes of strokes and distinguish risk factors for ischaemic and hemorrhagic strokes.
8. Physical disability and frailty – To identify social, lifestyle and biological factors that affect physical functioning and frailty and identify common pathways underlying CVD and frailty which can inform efforts to prevent the development of disability in older people with CVD.
9. Type 2 Diabetes, CVD and dementia - To examine the influence of duration of diabetes on CVD risk and dementia and identify metabolic pathways linking diabetes with dementia.
10. Improving clinical outcome – To identify and inform ways of evaluating and improving clinical outcomes in patients with CVD and/or dementia such as reducing hospitalisations and mortality.

Expected Benefits:

Enhanced data on health and disease outcomes in the BRHS cohort study will allow further detailed research on ways to prevent chronic diseases in older ages.

This will lead to development of the research evidence base that is needed to inform clinical guidelines and health policies to improve the health of ageing populations.

Global trends of ageing populations will acutely increase the health and social care burdens on individuals and society from chronic diseases such as cardiovascular disease, diabetes, dementia, other chronic diseases and disability in later life - these chronic diseases present both health and social care challenges in older populations. Therefore, research in this cohort study BRHS aims to establish the contributions of potentially important factors (obesity, diabetes, health behaviours, environmental and social factors) to prevent cardiovascular disease, diabetes, dementia, other chronic diseases and disability in later life - this research evidence is crucially needed to inform health policies and clinical guidelines to reduce the health and social care burden of chronic diseases in older people. The long term goal (>5 years) of the research is to lead to improved care and prevention of chronic diseases and disability in older people.

The BRHS has a track record of providing high quality evidence to improve the health of the public in the UK and internationally. To date (using data received under NIC-148411-Q64H8), the study has published over 500 peer reviewed research papers, providing high quality evidence about the epidemiology of these conditions and improving understanding on how to manage, treat and prevent them. Importantly, these papers have informed evidence based strategies to reduce the health and social care burden in older populations, as outlined in detail in section “Specific output” above. The researchers have contributed to a range of influential UK and international clinical guidelines for management and treatment of important chronic conditions including CHD, stroke, angina, arrhythmias, and diabetes which together cause substantial burdens of ill health in UK and globally.

The BRHS have also contributed to public health guidelines and clinical guidelines about the modification of important cardiovascular disease risk factors (e.g. lipids, obesity, alcohol use, physical activity, smoking and passive smoking).

Outputs:

Short term goals - 1 year

Enhancing the BRHS cohort study with detailed data on disease outcomes based on HES, MHSDS and DIDs data requested under this iteration of the Data Sharing Agreement.

Medium term goals - 2-5 years

Peer reviewed publications in scientific and clinical journals based on research objectives mentioned in objective for processing section.

Long term goals - 5 years and over

Adding to the scientific evidence base and knowledge to inform clinical guidelines and health policy.

The specific outputs from the use of the data will be to generate further high quality research evidence about prevention of chronic diseases and to improve the health of older populations. Importantly, the NHS Digital data linkage requested will substantially strengthen and enhance the BRHS data on chronic disease events and diagnosis and on health service use. The new data linkage will substantially increase the quality of data relating to treatment and management of disease events, permitting the researchers to investigate the disease endpoints in greater detail than has been done previously (e.g. understanding treatment received, recurrence of events and categorising sub-types of cardiovascular events). Linking the existing BRHS database to NHS Digital data will permit research into a wide range of public health relevant topics. The potential benefits for the prevention of cardiovascular disease, diabetes, dementia and other chronic diseases and disability in later life are substantial. Target dates will run from the time of acquiring the data until 2019 with plans to further extend funding for the study.

The BRHS cohort study has previously led to the development of evidence, knowledge and translation of evidence into health policies, as described below:

More than 500 peer-reviewed reports have already been published based on the study which uses mortality data from NHS Digital. It is hoped that research from this new study will be published and utilised in the same way.

Research from the BRHS has been used to shape and change many policies on cardiovascular disease prevention, both nationally and internationally. The BRHS provide outputs in the form of peer reviewed publications from the research directly to funding bodies and policy makers (Department of Health, British Heart Foundation, National Institute of Health Research, Medical Research Council, UK Health Forum), clinicians, public health specialists and other health researchers who then use the evidence to develop preventive strategies.

The researchers have a track record of research findings informing health policies on a range of issues related to primary and secondary prevention of cardiovascular disease (CVD), management of stroke, angina, arrhythmias, and diabetes and modification of risk factors e.g. lipids, obesity, alcohol use, physical activity, smoking and passive smoking.

The Study findings will be cited in reports by a range of influential national and international public sector bodies including the UK House of Commons Health Select Committee, the UK Department of Health, the U.S. Surgeon General (whose reports inform health policies both in USA and other countries around the world) and the World Health Organisation (e.g. their Guidelines for assessment and management of cardiovascular risk).

Previous research is also cited in guidelines produced by professional organisations for treatment of specific chronic conditions, e.g. NICE guidelines, American Heart Association guidelines for prevention of stroke and transient ischemic attack, for management of cardiovascular disease, and management of patients with ventricular arrhythmias, Australian guidelines for management of cardiovascular disease risk, Joint British Societies management of cardiovascular disease guidelines, Endocrine Society guidelines on hypertriglyceridemia and obesity.

Evidence generated from the research has also been used to support local public health programmes, for example, in developing initiatives for primary prevention of CVD and dementia in South East London. The research findings have been published in open access peer-reviewed scientific journals related to public health.

Processing:

The BRHS currently receives data from three sources:
1. Study participants - Physical Examinations - 1978-80, 1998-2000, 2010-2012 and regular postal questionnaires - no further personal identifiers are collected. Participants are asked to provide their date of birth when returning the postal questionnaire, to ensure the form is completed by the intended recipient.
2. GP record review - data collected annually directly from participants' GP using a questionnaire sent to the GP. An update of the participant address is requested so that the researchers can continue to contact cohort members.
3. NHS Digital - Participants were flagged in 1978-80 and the study receives Mortality notification & Cancer registration on a monthly basis via the Data Exchange Service (received under NIC-148411-Q64H8). This cohort is the original cohort; no new participants are added.

The BRHS cohort has been followed up since 1978. The researchers are requesting historic NHS Digital data (HES, MHSDS and DIDs) as far back as possible for this cohort (i.e. all the available years of data) and on an annual basis going forward. Only those members of the cohort who have provided consent will be followed up for the research purposes mentioned in the objective for processing section. These data will be used to enhance the data already held in the cohort and help to produce robust research findings. Data are requested for this cohort going as far back as possible because this will provide detailed information necessary for research on cardiovascular disease and dementia. A key feature of a cohort study is that health outcomes are assessed over time, which provides information on incidence (development) of disease. Therefore data on all available years are requested so as to have complete information on development of diseases - this is needed in order to investigate the research objectives, which are to investigate determinants and prevention of diseases.

Without all the retrospective data requested, the research will be limited to only assessing the prevalence of diseases and lead to biased and limited data analysis.

Processing of data for the linkage requested:

The BRHS will provide NHS Digital with Study ID, NHS number, DOB, Sex & last known postcode for linkage to the data requested from NHS Digital for 4,123 consented participants.

NHS Digital will return a pseudonymised dataset to the applicant containing study ID and match rank code.

The Data manager will then link this NHS Digital pseudonymised dataset, using the Study ID, to the BRHS cohort data ID for analysis. The NHS Digital data will not be linked back to any personal identifiers.
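
The linkage step described above can be sketched as a join on Study ID alone. This is a minimal illustration, not the BRHS data manager's actual procedure: the field names (`study_id`, `match_rank`, `baseline_bmi`, `nhs_number`, `date_of_birth`, `postcode`) and the function name are hypothetical. The guard against identifier fields reflects the statement that the NHS Digital data will not be linked back to personal identifiers.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical names for direct identifiers that must not appear in the
# analysis tables on either side of the join.
IDENTIFIER_FIELDS = {"nhs_number", "date_of_birth", "postcode"}


def _reject_identifiers(row: Dict[str, object]) -> None:
    """Raise if a row carries any direct identifier field."""
    present = IDENTIFIER_FIELDS & set(row)
    if present:
        raise ValueError(f"identifier fields present: {sorted(present)}")


def link_by_study_id(
    cohort_rows: Iterable[Dict[str, object]],
    returned_rows: Iterable[Dict[str, object]],
) -> Tuple[List[Dict[str, object]], List[object]]:
    """Join returned pseudonymised rows to cohort rows using study_id only.

    Returns the linked rows and the study IDs that had no cohort record.
    """
    cohort_by_id: Dict[object, Dict[str, object]] = {}
    for row in cohort_rows:
        _reject_identifiers(row)
        cohort_by_id[row["study_id"]] = row

    linked: List[Dict[str, object]] = []
    unmatched: List[object] = []
    for rec in returned_rows:
        _reject_identifiers(rec)
        cohort = cohort_by_id.get(rec["study_id"])
        if cohort is None:
            unmatched.append(rec["study_id"])
            continue
        # Cohort fields first, then the returned fields for the same study ID.
        linked.append({**cohort, **rec})
    return linked, unmatched
```

Returning the unmatched Study IDs separately, rather than silently dropping them, is a design choice made here so that linkage gaps can be reviewed.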

The pseudonymised dataset will be stored on UCL’s Sync & Share network drives which are only accessible with a UCL user ID and password. The pseudonymised data will then be made available to the research team of Medical Statisticians, Epidemiologists and Public Health clinicians, to carry out their research analysis. All the researchers working on the data are substantive employees of UCL.

For data from the Mental Health (MHSDS, MHLDDS, MHMDS) data sets, and any Mental Health data linked to HES or SUS, the following disclosure control rules must be applied:
• National-level figures only may be presented unrounded, without small number suppression
• Suppress all numbers between 0 and 5
• Round all other numbers to the nearest 5
• Percentages can be calculated based on unrounded values, but need to be rounded to the nearest integer in any outputs
• In addition for Learning Disability data in Mental Health (MHSDS, MHLDDS, MHMDS), the England-level data also must apply the suppression of all numbers between 0 and 5, and rounding of other numbers to the nearest 5.

All small numbers under 5 must be suppressed in line with the HES analysis guide.
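
As an illustration of the disclosure control rules listed above, the following is a minimal sketch under stated assumptions. The function names and parameters are hypothetical. The phrase "between 0 and 5" is read here as counts 1 to 4 inclusive (consistent with "small numbers under 5"), and the bounds are left as parameters because the agreement text does not state whether the end points are included.

```python
import math
from typing import Optional


def disclose_count(
    n: int,
    national: bool = False,
    suppress_low: int = 1,   # assumption: lowest suppressed count
    suppress_high: int = 4,  # assumption: highest suppressed count
) -> Optional[int]:
    """Apply suppression and rounding to a single count.

    Returns None for a suppressed value. Pass national=True only for
    national-level figures that may be presented unrounded; per the rules
    above, Learning Disability data must not use this exemption.
    """
    if n < 0:
        raise ValueError("counts must be non-negative")
    if national:
        return n
    if suppress_low <= n <= suppress_high:
        return None
    # Round to the nearest multiple of 5 using integer arithmetic
    # (no ties are possible for whole-number counts).
    return 5 * ((n + 2) // 5)


def disclose_percentage(numerator: int, denominator: int) -> int:
    """Percentage from unrounded counts, rounded to the nearest integer."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return math.floor(100 * numerator / denominator + 0.5)
```

For example, under these assumptions a count of 3 is suppressed, a count of 12 is reported as 10, and a count of 13 is reported as 15.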

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

All outputs will be restricted to aggregated data with small numbers suppressed in line with the HES analysis guide. No publications/outputs from the BRHS have ever presented or will present data which allow the identification of individuals. All data presentation is based on groups of subjects (generally >50 subjects, often considerably larger numbers).

The data from NHS Digital will not be used for any other purpose other than that outlined in this Agreement.


MR1362: Extension of NIC-349413-F1J1N - Next Steps Cohort Study — DARS-NIC-15226-X7Z9R

Type of data: information not disclosed for TRE projects

Opt outs honoured: N, Yes - patient objections upheld, Identifiable, Yes (Section 251 NHS Act 2006)

Legal basis: Section 251 approval is in place for the flow of identifiable data, Health and Social Care Act 2012 – s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2019-09-25 — 2022-09-24 2018.03 — 2024.01. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. MRIS - List Cleaning Report
  2. Civil Registration - Deaths
  3. Demographics
  4. MRIS - Cause of Death Report
  5. MRIS - Cohort Event Notification Report
  6. MRIS - Flagging Current Status Report
  7. Civil Registrations of Death

Objectives:

Next Steps Longitudinal Study of Young People in England (LSYPE) is an established longitudinal study which has followed the lives of 15,620 people born in 1989/90, since year 9 of secondary school. Study members were interviewed annually between 2004 and 2010 to map their transitions through education and into adulthood and the labour market. LSYPE is the largest and most detailed research study of its kind.

The most recent round of data collection - Next Steps Age 25 survey - took place between August 2015 and September 2016. Information was collected from 7,707 cohort members on many aspects of their lives such as education, employment, health and well-being, relationships and family life, housing and finances. Additionally, during the Age 25 survey, a wide range of data linkage consents were collected, including consent to linkage with health records held by the NHS.

The study was previously managed by the Department for Education (DfE). In 2013 the Economic and Social Research Council took over the funding and the study management legally transferred to the Centre for Longitudinal Studies (CLS) at the University College London Institute of Education.

CLS have already received data under the original data agreement (NIC-316681-W7P2R) and, under the subsequent amendment (NIC-349413-F1J1N), were able to pass data to their contracted fieldwork agency NATCEN to allow them to initiate/conduct the survey.

Next Steps was conducted annually between 2004 and 2010. Data collection focused on young people’s transitions into further/higher education and the labour market or to other outcomes, such as parenthood. The next wave took place in 2015 when cohort members were aged 24/25 years. The Age 25 survey gathered information about the lives of the cohort including education, employment, economic circumstances, family life, physical and emotional health and well-being, social participation and attitudes.

The ongoing success of the study depends on re-establishing and maintaining contact with as many study members as possible.

The aim of being provided with the address details and other up to date identifiers for the cohort was to trace as many study members as possible in advance of the next wave of fieldwork.

CLS supplied NHS Digital with a file of study members and their last known address, extracted from the CLS address database. CLS asked that these details are matched with NHS Registration Data and registered addresses supplied, where available.

The data supplied was entered into the secure address database and is used to maintain contact with study members and to invite them to take part in each fieldwork wave. When study members were contacted to invite them to participate in the Next Steps Age 25 study, it was made explicitly clear that they can inform CLS that they no longer wish to participate in the study and they will not be contacted again.

To conduct the age 25 survey, CLS contracted an external supplier, NatCen Social Research (the trading name of the National Centre for Social Research - www.natcen.ac.uk), to carry out the individual cohort study members’ interviews.

The survey is now completed and the aggregated outputs from the survey have now been deposited with the UK Data Archive, located at the University of Essex; no patient identifiable data is deposited. Please note that the data files supplied from NHS Digital, as part of this application, have been processed within CLS and entered into CLS’s secure address database. They have been used to maintain contact with study members and to invite them to take part in each fieldwork wave – this data is not sent to the UK Data Service.

NatCen Social Research were an external Data Processor and carried out the survey fieldwork and associated mailings for the Next Steps Longitudinal Study of Young People in England (LSYPE) Age 25 survey – specifically they were contracted to carry out: (1) Email and postal mailings to LSYPE cohort members about the study; (2) Interviews with Next Steps (LSYPE) cohort members. As the survey is now completed, the contract with NatCen has also ended and therefore NatCen are no longer acting as a data processor for CLS, and any data files have been securely deleted from NatCen’s systems.

CLS also require access to NHS Numbers for the cohort participants to be reinstated so that future matching and linkage exercises can be carried out, including those related to a Next Steps linked data application (NIC-51342, a separate data sharing agreement which provides HES data linked to the cohort).

Yielded Benefits:

The age 25 survey data is already providing important research evidence on transitions out of education and into early adult life, informing a range of key interlinked policy questions relating to higher education, employment, housing and family formation, and health. Data from the age 25 survey was deposited at the UK Data Service in June 2017 and has already been downloaded for over 100 research projects in many disciplines including economics, education and sociology; examples of outputs from those projects can be found at the Next Steps home page (https://cls.ucl.ac.uk/cls-studies/next-steps/), as well as publications in the Journal of Physical Activity and Health, the Journal of Adolescence and the European Journal of Public Health, and others. Its influence and impact will grow over the next few years, as it is used for research and policy on a wide range of different issues, and as the existing data is enhanced and augmented, particularly with linked administrative data. The Next Steps study has been shown to be a strong resource for researchers in addressing different developing aspects of life for young people. Through this research, Next Steps has contributed to the restructuring of public opinion and policy focused on young people in England.

Next Steps being used in the reform of vocational education for young people: In 2011 the Department for Education commissioned investigative work to find out how effective the ‘vocational’ education system in the UK was in helping young adults secure jobs. The study, conducted by Professor Alison Wolf using data from the Next Steps study, showed that young adults aged 16-19 were actively seeking work, but that around a third to a half of them struggled to find appropriate courses and jobs, and as a result changed occupations frequently and spent periods of time not in work, education or training.
Based on these findings, Professor Wolf was able to make 27 specific recommendations on how to improve practical education and training opportunities for young people that will help them to get jobs with good opportunities for progression. More information on this report can be found at the link (https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/180504/DFE-00031-2011.pdf). The government committed to acting on all the recommendations in Professor Wolf’s report with comments addressing the input from the Wolf Report.

Next Steps being used in Government Policy on Youth Unemployment: Findings from Next Steps have been used to inform the Government’s strategy on getting young people into education, employment or training. The struggling economic climate has often been cited as the reason for the large number of young adults not in education, employment or training (NEET). However, new research using data from Next Steps has shown that there are other factors that influence whether or not a person spends time NEET. The findings showed that educational attainment at age 16 is one of the most important factors in determining a person’s future path. Forty-five per cent of those with no GCSEs had spent more than a year NEET by the time they turned 18, compared to just 4 per cent of those with five or more GCSEs at A*-C. Building Engagement, Building Futures lays out the Government’s plans to tackle the causes of youth unemployment. This strategy proposes to maximise the participation of 16-24 year olds in education, training and work. More information can be found here (https://www.gov.uk/government/publications/building-engagement-building-futures). The strategy was developed by the Department for Work and Pensions, the Department for Education, and the Department for Business, Innovation and Skills. It comes in response to recommendations of the influential Wolf Report, which also draws on evidence from Next Steps.
Next Steps being used in Anti-Bullying Campaigns: Findings from Next Steps have been used in several anti-bullying campaigns and initiatives, as well as guidance for teachers and schools about how to stop bullying. And the efforts have paid off. A recent study has shown that secondary school pupils today are less likely to be bullied. Ten thousand fewer pupils are being bullied every day than 10 years ago, a major new study of secondary school pupils has revealed. The Department for Education report compares the experiences of Next Steps study members to members of Our Future, a new study which started following Year 9 pupils in 2013. This landmark research, which involved tens of thousands of young people from 2004 and 2013, is one of the largest of its kind ever undertaken and highlighted a fall in bullying since participants of Next Steps were recruited for the study.

Initial findings from the age 25 data, produced and published by CLS, have also contributed to political debate in relation to the labour market conditions for this generation, with Members of Parliament referring to findings on the negative impact of zero hours contracts on health in Prime Minister’s Questions in July 2017.

Further examples of what has been learnt from the study include:

Education: Next Steps has provided information on the factors that influence young people's performance at school, including the fact that attainment gaps between young people from rich and poor backgrounds emerged early in life and were very large by the time GCSEs were taken. Findings were used in setting up the Education Maintenance Allowance, a scheme which helps young people from low income families with the costs of travel, books and equipment for school or college.

Employment: Next Steps has contributed to the understanding of young people's experiences of the labour market.
It has shown that young people's educational attainment at age 16 is the most important factor affecting whether they are in education, training or employment at age 18. In 2011 the government used the findings in their policy on tackling the root causes of youth unemployment.

Social exclusion linked to academic struggles for young people in poor health: According to new research from Next Steps, teenagers with poor physical and mental health are often excluded from social circles and activities, which can have a knock-on effect on their performance at school and in the labour market. More information can be found here - https://nextstepsstudy.org.uk/social-exclusion-linked-to-academic-struggles-for-young-people-in-poor-health/

Expected Benefits:

The expected measurable benefits from the original agreement:

The study produces rich, longitudinal, policy-relevant data, currently unavailable elsewhere, for a large, representative sample of young adults. LSYPE data is widely used by policy makers to evaluate and develop policy and improve services for young people and also by academic researchers to chart and understand social change.

The information provided by cohort members provides valuable evidence for the research and policy community about the cohort’s transitions out of education and into early adult life. To enhance the research resource for secondary users, a fully documented, anonymised dataset was archived with the UK Data Service in May 2017.

Next Steps Age 25 survey data will enrich the already deposited data for the cohort (waves 1 to 7) and is expected to be particularly valuable for the research community, including researchers in health and social care, providing rich survey data on a range of different domains of young people’s lives. Particularly beneficial is the opportunity for a life course approach and to follow young people’s experiences over time to analyse later life outcomes.

Next Steps data is a resource with great potential for the research and policy community, and the information collected on health and its social determinants widens its potential value for health research and policy interventions. Through the set up at the UK Data Service, researchers are able to apply and carry out research utilising the established link to benefit health and social care.

Next Steps Age 25 Survey data has been deposited with the UKDS and the cohort members’ health is an important aspect in the Age 25 Sweep. Cohort members were asked a range of questions about their physical and emotional health and wellbeing and CLS is currently looking at initial findings on probable mental ill health at age 25 and its association with a number of potential risk factors. There is, however, a great deal more information about potential underlying determinants, in this and the earlier sweeps of Next Steps, available for researchers via the UKDS.

This request is to extend the Data Sharing Agreement. Retaining contact details of non-responders (to an annual mail-out) will enable the researcher to try and re-establish contact before the next wave of the longitudinal study (date to be confirmed) to be able to continue with the research.

Outputs:

Previous outputs from original agreement:

On receipt of the data, CLS processed the files and loaded more recent addresses to the database. Contact was made with the cohort members via the contracted fieldwork agency, NatCen, inviting them to take part in the survey. CLS have posted a participant survey information pack to all cohort members announcing the imminent launch of the survey. It will be made explicitly clear to study members that they can withdraw from the study if they no longer wish to participate. CLS: Participant contact information is held in a secure address database at the Centre for Longitudinal Studies. Any participants choosing not to take part in the study are flagged on this database with a code denoting whether their refusal is temporary (i.e. to this particular wave of data collection) or permanent (i.e. they wish to have no further involvement in the study). Anonymised survey data and confidential data from the address database are retained unless the participant specifically asks us not to, in which case this data is deleted.

As mentioned earlier in this section, CLS have contracted an external supplier (& Data Processor) NatCen Social Research to carry out the survey fieldwork and associated mailings for the Next Steps (LSYPE) Age 25 survey – specifically to carry out: (1) Email and postal mailings to LSYPE cohort members about the study; (2) Interviews with Next Steps (LSYPE) cohort members. These activities have been completed and the data files will be securely deleted from NatCen Social Research systems - this is in the process of being completed and a special condition has been added stating that NATCEN will delete the data within 3 months of the signing of this new agreement.

The fully documented, anonymised research dataset was archived with the UK Data Service in early-2017 to provide a strategically important resource for UK Social Science, including researchers in health and social care.

Extension request 2018:
There will be no further outputs at this stage; this request is to extend the Data Sharing Agreement so that the researcher can retain the cohort contact details and cohort members can be contacted at the next wave of the longitudinal study (date to be confirmed), allowing the research to continue.

Processing:

Previous processing activities from the previous agreement:

ACTIVITY 1. NHS address tracing. CLS wished to use NHS Digital patient status and tracking products, which use NHS registration data, to trace as many study members as possible, either by finding new address details or verifying existing address details for the cohort.

1.1 CLS supplied NHS Digital with a file of around 15,600 cohort members to match to the NHS data. The file supplied only contained eligible study members who had participated in at least one wave of Next Steps. It did not include study members known to have died or to have withdrawn from the study. The file contained the following data items:
- CLS identifier,
- First name,
- Last name,
- Middle name (where available),
- Date of birth,
- Sex,
- Last known address and postcode.

NHS numbers were not available for any study members.

1.2 CLS required all 15,600 cases to be sent for auto-matching.

1.3 Once the auto-matching process was complete, CLS reviewed the results and took a decision about which cases should be put forward for operator matching. CLS thought that any cases classified as status 'gone away' or 'unconfirmed address' (1,026 and 3,367 cases respectively) were likely sub-groups for operator matching.

1.4. NHS Digital supplied the following details to CLS.
- CLS identifier,
- Latest surname,
- Latest forename,
- Latest middle name (where available),
- Date of birth,
- Gender,
- Latest address and postcode,
- Fact of Death
- Date of address registration or update.

In addition to the receipt of any 'new' matched address information for the cohort members, CLS required NHS Digital to add an additional variable that described the outcome of the matching process to the data that is returned to them. This additional variable allocated each cohort member to one of the following three categories:
• new/different address found,
• existing address confirmed,
• no match found.
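
The three outcome categories above can be illustrated with a short sketch. The agreement does not describe how NHS Digital derived the variable, so the comparison rule here (case-insensitive, ignoring commas and extra spaces) and the function names are assumptions for illustration only.

```python
from typing import Optional

NEW_ADDRESS = "new/different address found"
CONFIRMED = "existing address confirmed"
NO_MATCH = "no match found"


def normalise(address: str) -> str:
    """Upper-case an address, drop commas and collapse runs of whitespace."""
    return " ".join(address.upper().replace(",", " ").split())


def match_outcome(supplied_address: str, returned_address: Optional[str]) -> str:
    """Allocate a cohort member to one of the three outcome categories.

    returned_address is None when the matching process found no record.
    """
    if returned_address is None:
        return NO_MATCH
    if normalise(supplied_address) == normalise(returned_address):
        return CONFIRMED
    return NEW_ADDRESS
```

A real address comparison would likely need more careful handling (abbreviations, postcode formatting), which is outside what the source text specifies.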

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

The data from NHS Digital will not be used for any other purpose other than that outlined in this Agreement.

Extension request 2018:
There will be no further processing at this stage; this request is to extend the Data Sharing Agreement so that the researcher can retain the cohort contact details and cohort members can be contacted at the next wave of the longitudinal study (date to be confirmed), allowing the research to continue.


Advancing Survivorship after Cancer: Outcomes Trial (ODR1819_039) — DARS-NIC-656825-X7T4K

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant, No (Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 – s261(2)(c)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2023-08-07 — 2026-08-06 2023.12 — 2023.12. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. NDRS Cancer Registrations

Yielded Benefits:

Data for this study has previously been shared when the data were controlled and managed by Public Health England (PHE). As such there are some yielded benefits to be observed from the access to the data for the study prior to NHS England becoming data controller. These yielded benefits are noted below. Previously disseminated data has been linked to the trial data so that the participants can be correctly classified based on their cancer diagnosis. This has been used in the analysis for a number of papers which UCL are due to submit to peer reviewed journals in the near future, including the main trial results (up to 6 month follow-up). Preliminary findings suggest the intervention was successful at supporting patients to improve their health behaviours.

Expected Benefits:

Breast, prostate and colorectal cancer are 3 of the most common cancers diagnosed each year, with approximately 150,000 new diagnoses a year in the UK across these three cancer types. This intervention involves a booklet and feedback from a health professional about health behaviours and the potential benefits of making healthy changes to these. If this intervention is shown to benefit cancer patients, it can be incorporated into patient care so that all patients have access to it. This could then improve health behaviours in this group by helping them to move closer to meeting public health recommendations. Such improvements have the potential to improve their cancer outcomes as well as other health conditions. The NHS England data will allow UCL to assess whether the intervention has improved cancer outcomes and survival. If the intervention has resulted in these improvements this could lead to further work focused on how to implement this intervention within the NHS both locally and nationally. The researchers will disseminate the findings widely and discuss with hospitals and funders (including relevant charities such as WCRF, Cancer Research UK and Macmillan). The next steps are to work towards providing this intervention as a part of standard care. The aim is that this will happen during 2023.

Outputs:

The data on cancer diagnoses disseminated in the first data share will be used to describe the sample in all analyses and subsequent papers, in peer reviewed journals using the trial data. This main trial paper is still in progress.

The data on cancer diagnoses and mortality will be used as part of the trial analysis of the impact of the intervention which will be published as a paper. It will hopefully also be used in further analyses linking variables UCL collected data on in the trial to longer term cancer outcomes and survival. These will all be published as research papers in peer reviewed journals.

A report of the main results will be sent to ASCOT Study participants via a newsletter and shared on the study website once the data has been analysed. As more subsequent studies are published, participants will be signposted to the study website.

These results will be presented at other research institutions, at national and international conferences (e.g. UK Society for Behavioural Medicine, American Society of Clinical Oncology, International Society for Behavioural Nutrition and Physical Activity) and at patient-facing events (e.g. charity hosted), and will be shared with the recruitment sites via email. UCL may seek additional funding to explore the potential to implement their intervention within the NHS over the long-term, which could involve engagement with local and national policymakers.

UCL hope to complete the majority of this dissemination before the end of August 2023 in line with project funding however some research will continue beyond this, in particular the work linking to later cancer events and mortality.

No participants will be identifiable in this data. Data will be aggregated with small numbers suppressed such that averages are presented, e.g. number of participants with each type of cancer, average time since diagnosis, and the different variables will not be linked.
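As a rough illustration of aggregation with small-number suppression, the sketch below counts participants per cancer type and masks any count below a threshold. The threshold of 5 and the function name are assumptions made for the example; the agreement does not state the cut-off UCL applies.

```python
from collections import Counter

# Hypothetical cut-off: the agreement does not say what counts as a "small number".
SUPPRESSION_THRESHOLD = 5


def suppressed_counts(values, threshold=SUPPRESSION_THRESHOLD):
    """Count occurrences of each value and mask counts below the threshold."""
    counts = Counter(values)
    return {
        key: (n if n >= threshold else f"<{threshold}")
        for key, n in sorted(counts.items())
    }


if __name__ == "__main__":
    cancer_types = ["breast"] * 7 + ["prostate"] * 2
    print(suppressed_counts(cancer_types))
```

This covers primary suppression only. A real disclosure-control process would also need to check that masked cells cannot be recovered from published totals.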

Processing:

Eligible patients were identified by searching the electronic cancer databases (e.g. Somerset or Info-flex) of each of the participating NHS trusts cited in objective for processing. The initial list was drawn up by an appropriate member of staff at the trusts (e.g. a nurse or data analyst). The list was carefully cross-checked (via surgical diaries or the multidisciplinary team) to ensure that patients were still alive and had a diagnosis of breast, prostate or colorectal cancer.

Once the list of eligible patients had been finalised the NHS sites only shared a list of finalised ID numbers and cancer types with UCL. The research team at UCL could then issue the staff at the trusts with the appropriate number of copies of the ASCOT initial patient survey (health and lifestyle questionnaire), the letter from the consultant and envelopes. At the back of the survey there was an invitation for participants to leave their details if they would be interested in hearing more about a trial of a lifestyle intervention for cancer patients. The NHS Trusts then sent the letter and survey together in the envelope to the patients to ensure that the research team at UCL did not have initial access to patient identifiable information unless a participant chose to leave their details on the initial survey invitation. All subsequent questionnaires for consented patients for 3/6 and 2 year follow ups were sent by UCL.


Staff at the NHS sites kept a copy of the list of patients with allocated ID numbers that were printed on the initial survey. Completed initial surveys were returned by patients directly to UCL. Staff at UCL then informed the NHS sites of the ID numbers that were returned so reminders could be sent to any patients who hadn’t completed the initial survey. When surveys were returned UCL checked eligibility for the trial for participants who chose to leave their contact details on the survey. If participants were eligible, they were sent an information sheet and consent form for the trial and could consent.

UCL will share trial ID numbers, NHS numbers, names, date of birth, sex and postcode of the enrolled consented participants to allow NHS England to identify the trial participants in the NCRAS data. NHS England will then return the trial ID numbers together with data on cancer diagnosis, hospital records, treatment information, health status and mortality for these participants. The patient-level cancer registration data is disseminated from NHS England to UCL’s data safe haven via SEFT (secure electronic file transfer).

UCL will use cancer registration data collected by the National Cancer Registration and Analysis Service (NCRAS; cancer registry in England) to allow them to group the sample population based on the cancer type and stage at diagnosis. The study team will also use the data (on cancer diagnoses and mortality) to compare the experimental groups (those who received the intervention and the control group) to assess if the intervention had a positive impact on these outcomes. The study team are also interested in additional related research questions, for example, exploring whether relationships between health behaviours and survival are moderated by patient variables (like patient reported outcomes of anxiety/depression) to determine if any groups are especially impacted. There is a funded UCL PhD student (who is part of the ASCOT team) who will explore these questions, as well as continuing other planned ASCOT analyses. The disseminated data will be integrated into the trial dataset so that cancer outcomes and survival data become an outcome that can be analysed in relation to various data that the study team collected from participants during the trial.

Data processing will only be carried out by employees of UCL and one enrolled PhD student. All of those carrying out data processing via the data safe haven will complete yearly training on information governance, data protection and confidentiality. The 2nd Principal Investigator (PI) is based at the University of Leeds but will not have access to data disseminated under this agreement. The 2nd PI was involved in the study design and plans for analysis, will contribute towards paper writing, and will undertake these duties under an honorary contract with UCL.

ASCOT began in 2015 and data collection from participants in the form of the ASCOT patient questionnaires ended in 2021. However, the trial is not considered complete until the study team have received the final NCRAS data from NHS England. UCL plan to examine co-morbidities up to ten years after participants consented. UCL will therefore retain personal identifiers until this time (2028-2029 at the latest). After this, UCL will destroy all identifiable data by deleting the database which links patient IDs to patient identifiers, making all the remaining data anonymous to the UCL study team.


All data collected as part of the ASCOT patient questionnaires is pseudonymised using participant ID numbers as soon as the surveys are received. Consented patient identifiable information is stored separately to their ID numbers in a locked filing cabinet in the Department of Epidemiology and Public Health at UCL.

Electronic Data collected as part of the ASCOT patient questionnaire and the ASCOT trial is stored in the data safe haven (a secure environment) by the research team in UCL’s Department of Behavioural Science and Health and accessed remotely by the study team for statistical analysis. The study team can look at the data on the server but are prohibited from moving it to any other machines.

The listed joint data processor, VIRTUS data centre, is used exclusively for secure server storage. VIRTUS only supply the physical location for storage. VIRTUS operate 7 layers of physical security on site, including perimeter fencing, access control, external and internal CCTV and restricted pass code access. Staff at VIRTUS data centres will not have access to NHS England data.

The study team will process, store and dispose of patient identifiable information (names and contact details) in accordance with all applicable legal and regulatory requirements, including the Data Protection Act 2018 and any amendments thereto.

All data used in publications and outputs will be completely anonymous, using aggregated data with small numbers suppressed.


1970 British Cohort Study - Tracing — DARS-NIC-129836-D5F3W

Type of data: information not disclosed for TRE projects

Opt outs honoured: Identifiable, Yes (Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 – s261(7); Other-National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2022-10-07 — 2025-10-06 2022.12 — 2023.12. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Demographics

Objectives:

The Centre for Longitudinal Studies (CLS) at University College London (UCL) is an academic resource centre responsible for producing and disseminating data resources for the scientific community. It is responsible for four of Britain's internationally renowned longitudinal cohort studies: the 1958 National Child Development Study, the 1970 British Cohort Study (BCS70), Next Steps, and the Millennium Cohort Study (MCS). All these studies are following the groups of participants from cradle to grave. As such, this group of studies is unique and has provided, and is still providing, a wealth of information used in the policy decisions affecting society's health and well-being.

This Data Sharing Agreement is specific to the 1970 British Cohort Study (BCS70).

The purpose of this Agreement is to update CLS' database with new addresses. CLS is requesting to receive updated addresses for the BCS70 cohort members on an annual basis. The new addresses will be used to invite participants to take part in the upcoming survey. They will also be used to communicate with participants in between sweeps for various other activities related to their participation in the survey, for example pilot surveys, participant engagement, birthday communications etc.

Background
The British Cohort Study 1970 (BCS) is one of Britain’s world renowned national longitudinal birth cohort studies. It follows a large sample of individuals born over a limited period of time (all those born in one week in 1970) through the course of their lives, charting the effects of events and circumstances in early life on outcomes and achievements later on. The study has its origins in the British Births Survey in which information was gathered about almost 17,500 babies. The original study focused on the circumstances and outcomes of birth but since then the study has broadened in scope to map all aspects of health, education, social and economic development. The Study is funded by the Economic and Social Research Council (ESRC).

Since 1970 there have been nine attempts to gather information from the whole cohort. Over time, the scope of enquiry has broadened from a medical focus at birth, to encompass physical and educational development at the age of five, physical, educational and social development at the ages of ten and sixteen, and then to include economic development and other wider factors at ages 26, 30, 34, 38, 42 and 46. The current sweep age 51 was scheduled to begin in 2020. Due to the pandemic, it had to be postponed. It is now underway and is scheduled to continue until early 2023.

The Sweep age 51 will provide the opportunity to collect a range of information from cohort members to aid the understanding of midlife outcomes across multiple life domains and their lifetime determinants. This data collection will build on the extensive data collected from birth and across the lifetime of cohort members and will facilitate comparisons with other generations, particularly the 1958 cohort at 50, and the 1946 cohort at 53, allowing for the study of social change. The data will be of interest to researchers working in a wide range of disciplines, including population health and epidemiology, economics, sociology, demography, psychology and others. It has the potential to inform a wide range of policies, including those relating to work, health, relationships, and civic participation. It will include a face-to-face interview with cognitive assessments (some interviews may be carried out via video link rather than in person), a paper self-completion questionnaire and an online diet questionnaire. The interview and paper self-completion questionnaire will cover the following three broad themes:

- Family, relationships and identity: including topics such as social networks, relationships with partners, parents, children, friends, neighbourhood, social and cultural capital, social and political participation, attitudes and values, religion, and expectations.
- Finances and employment: including topics such as work, income, wealth (savings and debts, pensions, and housing), inheritance (receiving and giving) and other transfers, and education.
- Health, wellbeing and cognition: including topics such as physical health, mental health, medical care, medication, smoking, drinking, diet, exercise, and cognitive function.

Data Summary
CLS are requesting access to record level, identifiable data linked to the cohort from the following datasets:
- Demographics dataset

Aim
Of the approximately 17,500 individuals that have ever participated in the study there will always be a number of individuals for whom the Centre for Longitudinal Studies (CLS) at University College London will not have confirmed addresses at the time of carrying out the next survey.

The ongoing success of the study depends on maintaining contact with as large a number of study members as possible. Therefore, CLS are seeking permission to be supplied with updated addresses for BCS70 cohort members. CLS believe that a substantial number of these individuals would be willing to participate in the Age 51 survey if they could be contacted. Previous efforts to re-establish contact for earlier surveys for the BCS70 cohort study have been very successful, using NHS Digital data to assist with maximising participation in the Surveys.

All of these individuals have made an informed decision to participate in the study over the years and have been made aware that the study is seeking to follow them throughout their lives. This information is provided to participants on the study website under ‘How we find you’ https://bcs70.info/faqs/#keeping-in-touch# ‘Do you use information held by Government to find us?'. CLS provide a link to the information on the study website in all materials provided to cohort members. Cohort members receive an advance booklet with complete information about each upcoming survey.

Each year CLS sends an annual postal mailing to all BCS70 participants. CLS asks that participants complete a reply slip which is returned to CLS which allows participants to provide CLS with any change in their details e.g., a new email address, phone number, etc. CLS also ask them to return the reply slip even if none of their details have changed i.e., seeking a positive confirmation that that is the address CLS hold for them. As a result, CLS can maintain the cohorts' latest details on the BCS70 database. In the event of the annual mailing not reaching the participant it is returned to CLS as a 'return to sender'. CLS will attempt to trace all these returns but if CLS cannot locate the participants then they are flagged on the database as a 'gone-away'. NHS Digital may potentially hold a more recent address and provide CLS with an opportunity to invite the cohort to re-join the study. CLS will send the details of approximately 12,000 cohort members to be linked to the NHS Digital Personal Demographics Service (PDS) dataset. This number excludes those who have died and those who have requested to be withdrawn from the study. NHS Digital will supply new addresses for study members who can be matched to the PDS dataset.
CLS have contracted an external supplier NatCen Social Research (the trading name of the National Centre for Social Research) to carry out the individual study members' interviews. NatCen was commissioned to run interviews with study members for the Age 51 Survey. NatCen have also contracted an additional Data Processor, Kantar Public, to assist with the interviews.

CLS intend to use the new addresses to invite participants to continue participating in the study. This will include sharing addresses with NatCen and Kantar so that participants can be invited to take part in the current Age 51 Survey. Further details about exactly how addresses will be used are provided below in Section 5b.

If a study member makes clear that they do not wish to take part in the study this is flagged on the CLS database with a code denoting whether their refusal is temporary (i.e. for a particular wave/survey) or permanent (i.e. they wish to have no further involvement in the study). Any previously deposited anonymised survey data for a study member and confidential data from the address database are retained unless the study member specifically asks us not to, in which case this data is securely deleted.

With regard to a request for 'withdrawal' from a participant CLS classifies them as a 'withdrawal from the current survey' or a 'withdrawal from the study' and these are handled slightly differently:
(a) Withdrawal from the current survey: CLS will flag this on its computer system to indicate that the participant will not be taking part in the current survey and the reason for not wanting to take part is also recorded. For example, they may just not have the time to take part. Therefore, there will be no further contact with the participant for the duration of the current survey, but they will be invited to take part in the next survey.
(b) Withdrawal from the study: CLS will flag this on its computer system as a permanent refusal to indicate that the participant will not be taking any further part in the study itself and the reason for this type of withdrawal is also recorded for analysis purposes. Therefore, there will be no further contact with the participant for the remainder of the longitudinal study. If this request is received in writing, then CLS will acknowledge the request and notify the participant that they have been flagged and will no longer be contacted or receive any further communications. This request may sometimes be accompanied by a request for the destruction of their data.
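The two withdrawal types above can be sketched as a small data model. This is an illustration only: the class, field and sweep names are hypothetical, and the real CLS database codes are not described in the agreement.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Withdrawal(Enum):
    NONE = "none"
    CURRENT_SURVEY = "current_survey"  # temporary refusal, one sweep only
    STUDY = "study"                    # permanent refusal


@dataclass
class StudyMember:
    cls_id: str
    withdrawal: Withdrawal = Withdrawal.NONE
    withdrawn_from_sweep: Optional[str] = None  # set for CURRENT_SURVEY only
    reason: Optional[str] = None                # recorded for both types


def may_contact(member: StudyMember, sweep: str) -> bool:
    """Return True if the member may be contacted about the given sweep."""
    if member.withdrawal is Withdrawal.STUDY:
        return False
    if (member.withdrawal is Withdrawal.CURRENT_SURVEY
            and member.withdrawn_from_sweep == sweep):
        return False
    return True
```

A member who declined the Age 51 sweep for lack of time would be excluded from that sweep's contacts but invited again for the next one, whereas a permanent refusal blocks contact for every sweep.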

As University College London (UCL) determines the purposes and means of personal data processing under this agreement, it is the sole Data Controller and will also process data. NatCen and Kantar Public are Data Processors.

UCL's legal basis for processing (acquiring, linking and sharing) personal data is for a public task under GDPR Article 6(1)(e) i.e. processing is necessary for the performance of a task carried out in the public interest (as is made explicit to participants in the information leaflets provided). UCL also process special categories of personal data for research under GDPR Article 9(2)(j) i.e. processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes. In addition, for ethical reasons and under the Common Law Duty of Confidentiality, UCL sought permission from the Confidentiality Advisory Group to access this data without consent. CLS has also received Research Ethical Committee (REC) approval for tracing participants via NHS Digital.

Research using the data from this study will contribute to a body of evidence which has the potential to inform government and result in policy change in various areas including health. As a result, this has the potential to benefit the public in general.

Participants are aware that the study will attempt to trace them and CLS are confident that many of those newly traced via the NHS will be happy to take part. CLS also uses its own methods for tracing cohort members for example, asking the named stable contacts (relative, neighbour or friend) for the participant’s new address.

The Economic and Social Research Council (ESRC) are the funder for this study. No data received under this agreement will be shared with the funder.

Yielded Benefits:

In the BCS70 Age 42 Survey, which was conducted in 2012, UCL CLS achieved almost 800 interviews with participants who had been newly traced to an address supplied by NHS Digital – and this included almost 600 interviews with study members who had not previously taken part in the study for over 10 years. This shows very clearly the value of tracing participants in this way. The addresses obtained in previous versions of this agreement were also very useful to invite study members (BCS70 age 46 sweep) to take part and re-engage with the studies. Using addresses provided by NHS Digital helped CLS get in touch with those cohort members who would otherwise not be able to take part in a new survey. The BCS46 data is now available for researchers to access via the UK Data Service, providing an important resource for UK Social Science, including researchers in health and social care. A link to the dataset is provided here: https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=8547

Expected Benefits:

The continuing success of the study will be underpinned by the successful matching of untraced cases. Submitting the cohort for up to date contact details and being able to keep the information for future sweeps, will allow the researchers to re-contact the participants who CLS have lost touch with and give them the opportunity to re-engage or clearly state that they wish to withdraw. It will also ensure that literature goes to the correct name and address. It will also ensure that no contact will be made with participants who have died.

The information collected during the Age 46 Survey and in the future sweeps age 51 and 54 may enable researchers to uncover life course and inter-generational factors which contribute to healthy ageing among this generation, and thus to inform the development of preventative health policies across the whole of life that will expand healthy life expectancy, and reduce the burden of ill-health and disease at older ages.

The BCS46 data is now available for researchers to access via the UK Data Service, providing an important resource for UK Social Science, including researchers in health and social care. This Sweep involved many data collection elements, including a full range of bio-measures administered by a nurse. The inclusion of objective measures of health will allow researchers to assess the longitudinal predictors of health in midlife. Many of the measures were included in CLS’ other study the NCDS age 44 biomedical sweep, which will allow for cross cohort comparisons. A link to the dataset is provided here: https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=8547

The use of the data may result in papers that will be published, presented at conferences and sometimes receive media coverage. Most papers will contribute to a body of evidence which will result in improvements to health care users' experience or health care delivery. It is expected that occasionally these may have a higher impact, such as the examples highlighted below:

- For example the Welsh Government policy on early years planning - http://www.closer.ac.uk/news-opinion/2013/welsh-governments-early-years-childcare-plan-draws-evidence/

- Encouraging reading for pleasure and children’s cognitive development.
Research using data from the 1970 British Cohort Study (BCS70) has revealed how reading for pleasure can help children excel not only in English but also in maths. This important work, led by CLS, has had a big influence on reading for pleasure programmes, policies and practice in the UK and beyond, benefitting millions of children worldwide. The link between reading for pleasure and children’s maths and vocabulary scores was covered extensively in the media, including in articles in the Daily Telegraph, Sydney Morning Herald and Vancouver Sun and in interviews for BBC Radio 4’s Today Programme, BBC London and Al Jazeera. The findings attracted a remarkable amount of interest from schools, libraries and literacy organisations around the world. They have been used to help protect library services, to persuade children of all ages to spend more time reading, and to encourage parents to support schools’ home reading initiatives. In the UK, the research was cited in a 2015 Department for Education report, ‘Reading: the next steps’, underpinning recommendations for government funding to support book clubs, resources for reading, and instructing schools to promote library membership.
Selected coverage:
The Guardian – ‘Reading for fun improves children’s brains, study confirms’
Daily Telegraph – ‘Reading for pleasure ‘boosts pupils’ results in maths’
Vancouver Sun – ‘ Libraries are worthwhile public investment’
Sydney Morning Herald – ‘Reading gives kids an edge, study says’

Outputs:

The addresses supplied by NHS Digital under this agreement will help to boost the sample size and to increase the data collection at age 51. Prior to a new survey collection, there are always a number of people with whom CLS will have lost contact between the previous surveys and the new upcoming survey. The output of the demographic data will be participation in the survey by those participants who wouldn’t otherwise have participated in the Age 51 Survey. In the BCS70 Age 42 Survey, which was conducted in 2012, CLS achieved almost 800 interviews with participants who had been newly traced to an address supplied by NHS Digital – and this included almost 600 interviews with study members who had not previously taken part in the study for over 10 years. This shows very clearly the value of tracing participants in this way.
The main outcome for the study is the next sweep, Age 51 currently underway and scheduled to complete in 2022. This will be a fully documented, anonymised research dataset which will be archived with the UK Data Service to provide a strategically important resource for UK Social Science, including researchers in health and social care. For clarity, the data stored on UK Data Service does not include NHS Digital data but responses from participants of the study.

The scientific priorities and questionnaire content of the survey were elaborated and developed in consultation with the academic and policy community with the aim of collecting both information relevant to their lives at age 51 and to later life outcomes, as well as repeat measures of topics covered at age 51. CLS will continue to prospectively harmonise the content with other comparable cohorts, particularly those in the UK, by drawing on comparable measures at a similar age. All surveys are overseen by the CLS Strategic Advisory Board (SAB) which contains representatives from UK Research and Innovation (UKRI), Wellcome Trust, Medical Research Council, the scientific community, and government departments. The SAB provide high level strategic oversight for CLS to ensure the cohort studies led by the centre are developed, managed, and maintained in a manner that maximises their benefit as long-term scientific resources of importance both nationally and internationally, while protecting participants' interests. The SAB ensure that the content is closely aligned with research priorities, as well as the areas of research interest (ARIs) published by government departments. CLS reflect these priorities when deciding on the major themes which CLS intend to cover at the sweep.

In addition to the creation of this rich database, which will be BCS70 Age 51, CLS will continue to produce outputs from the study via the UK Data Service in the form of aggregated reports with small numbers suppressed, for the benefit of the wider research community, as BCS70 data have previously proven to be sought after across a wide range of research areas.

CLS will also publish papers in a range of journals; however, it is not possible to provide detail at this point as to precisely which journals and dates, but the intention is to produce outputs along the same lines as those produced after the previous sweep. All scientific papers using the BCS70 data are published on the CLS Bibliography page online.
https://www.bibliography.cls.ucl.ac.uk/Bibliography.aspx?sitesectionid=647&sitesectiontitle=Bibliography

Processing:

NHS address tracing and matching variables:
NHS address tracing. CLS wish to use the demographics data to receive updated addresses regularly (annually) for the BCS70 cohort members.

CLS will supply NHS Digital with a file of approximately 12,000 study members to match to NHS data. The file supplied will only contain eligible study members who have participated in at least one wave of BCS70. It will not include study members known to have died or to have withdrawn from the study. The file will contain the following identifying data items:
- CLS identifier
- First name
- Last name
- Middle name (where available)
- Date of birth
- Sex
- Last known address and postcode
- NHS Number

NHS Digital will match the cohort identifiers to the demographics data and then supply the following details to CLS:
- CLS identifier
- NHS number
- Requested fields from Demographics dataset.

No other data will be linked to the NHS Digital data received.
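The construction of the outgoing matching file can be sketched as follows. This is a minimal illustration under stated assumptions: the column names, the CSV format and the `waves_participated`, `deceased` and `withdrawn` flags are all hypothetical, since the agreement lists the data items but not the file layout.

```python
import csv
import io

# Hypothetical column names for the identifying data items listed above.
FIELDS = [
    "cls_id", "first_name", "last_name", "middle_name", "date_of_birth",
    "sex", "last_known_address", "postcode", "nhs_number",
]


def eligible(member: dict) -> bool:
    """Took part in at least one wave, not known to have died, not withdrawn."""
    return (
        member["waves_participated"] >= 1
        and not member["deceased"]
        and not member["withdrawn"]
    )


def build_tracing_file(members) -> str:
    """Return CSV text containing only eligible members and only the listed fields."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=FIELDS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for member in members:
        if eligible(member):
            writer.writerow(member)  # status flags are dropped by extrasaction
    return buf.getvalue()
```

Passing `extrasaction="ignore"` means internal status flags never reach the outgoing file, so only the listed identifying items are shared.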

The NHS number is very useful in determining whether a record belongs to the correct cohort member. Where there are people with the same name as the cohort member in the NHS database, and the address CLS receive is different to the address CLS hold in CLS’s database, CLS will use the full name and the NHS number to ensure that the new address NHS Digital have sent is for the right person. CLS don’t use the NHS number for anything other than validation and matching: when CLS need to send NHS Digital a file for linkage in future, CLS will include NHS numbers in the matching file so that NHS Digital can more easily match CLS’s cohort members in NHS Digital’s database.

All those accessing the data supplied by NHS Digital are substantive employees of University College London or employees of the processor organisations (NatCen and Kantar) carrying out work on behalf of UCL who have been appropriately trained in data protection and confidentiality.

At UCL, the NHS Digital data will be held securely at the UCL Data Safe Haven (DSH) and accessed remotely by CLS staff. The UCL DSH is certified to ISO 27001:2013 and is compliant with NHS Digital’s Data Security and Protection Toolkit. Staff using the DSH complete annual training and regularly review data access arrangements, ensuring access to data is limited to those authorised. UCL Computing Regulations are based on the premise that access to resources is generally forbidden unless expressly permitted. All data transfers from the DSH require approval and are carried out through secure portals which are fully audited. Access to the UCL DSH is via remote desktop and requires multi-factor authentication. In addition to a strong password, each user has to use a six-digit number generated by a smartphone app or physical token at each login. Passwords must be changed at regular intervals, and unused accounts are automatically disabled after a fixed period. Once inside the environment, robust access control ensures that researchers can only examine information that they are approved to use.

The data file supplied by NHS Digital will be reviewed by CLS, who will make a judgement as to whether each address should be considered ‘new’. The decision as to whether to regard an address provided by the NHS as ‘new’ will be made as follows:

• Is the NHS address the same as the current address held on the study database, or the same as a historical address held in our database – which we have previously established is no longer the address of the study member? If so, the NHS address will not be uploaded.
• Has the current address on our database been confirmed in a recent survey or via some other way? If so, the current address will be retained and the NHS address will not be uploaded.
• Is the date associated with the NHS address more recent than the date at which the current address on our database was most recently confirmed? If not, the current address will be retained and the NHS address will not be uploaded.
• The NHS address will therefore be regarded as a ‘new’ address if a) we do not already have a recently confirmed address on our database, b) the NHS address is more recent than the address on our database and c) the NHS address is not the same as an existing address on our database. Addresses which are considered to be new will be uploaded into our database.
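
The decision rule above can be sketched in code. This is an illustrative sketch only, not CLS's actual implementation: the record structure, the address normalisation, and the one-year threshold for a ‘recently confirmed’ address are all assumptions, since the agreement does not define them.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional


def normalise(address: str) -> str:
    """Crude comparison key (an assumption): ignore case and extra whitespace."""
    return " ".join(address.casefold().split())


@dataclass
class StudyRecord:
    """Hypothetical view of one study member's entry in the study database."""
    current_address: Optional[str]
    last_confirmed: Optional[date]  # when the current address was last confirmed
    historical_addresses: List[str] = field(default_factory=list)


def is_new_address(
    record: StudyRecord,
    nhs_address: str,
    nhs_address_date: date,
    today: date,
    recent_window: timedelta = timedelta(days=365),  # assumed meaning of "recent"
) -> bool:
    """Return True if the NHS address should be uploaded as a 'new' address."""
    # Rule 1: the NHS address must not match the current or a historical address.
    known = {normalise(a) for a in record.historical_addresses}
    if record.current_address:
        known.add(normalise(record.current_address))
    if normalise(nhs_address) in known:
        return False

    if record.last_confirmed is not None:
        # Rule 2: a recently confirmed current address is retained.
        if today - record.last_confirmed <= recent_window:
            return False
        # Rule 3: the NHS address must be more recent than the last confirmation.
        if nhs_address_date <= record.last_confirmed:
            return False

    return True
```

In this sketch a record with no confirmation date is treated as having no recently confirmed address, which is one possible reading of condition a).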

At the outset of any data collection project, CLS sends a ‘sample’ file to the fieldwork agency which contains the latest contact details for all study members who are to be invited to take part. This information includes name, sex, date of birth, addresses, telephone numbers and email addresses.

The individuals selected to take part will be all those who have taken part in at least one recent survey (last three sweeps) and all those who have not taken part in a recent survey but for whom we have recently obtained new contact details. NatCen will send a letter (and email) to all study members on behalf of CLS, which will invite them to participate in the forthcoming survey and will let study members know that an interviewer from NatCen or from Kantar Public will be making contact with them soon. NatCen will allocate half of the study members to be contacted and interviewed by Kantar Public interviewers (NatCen have sub-contracted 50% of interviewing to Kantar Public because of capacity constraints). NatCen will send the names and addresses of these cases to Kantar Public in order that they can allocate study members to their interviewers. NatCen interviewers and Kantar Public interviewers will both gain access to the names and addresses via NatCen systems.

Kantar Public interviewers will access NatCen systems via a Virtual Machine Network.


Invitations are sent out – and then shortly after, interviewers will make contact with participants via telephone and via personal visits to the address.

Ideally, this NHS tracing exercise would have been completed prior to the commencement of the current fieldwork so that the NHS addresses could have been supplied to NatCen in the original sample file. This would have allowed invitation mailings to be sent to the new addresses.

Fieldwork on the current survey is being conducted in ‘waves’. In total around 12,000 will be invited to take part – this will happen in waves of around 2,000 which are spaced several months apart.

We have so far issued 3 waves of fieldwork – so around half of the total. For these waves, postal invitation mailings were sent to the latest addresses held in our database and interviewers have been trying to reach these cases in order to see if participants are willing to take part and to conduct interviews if so.

Because fieldwork has already started, the new NHS addresses will be provided to NatCen as an ‘update’ to the original sample file.

The way in which the new NHS addresses will be used will depend on whether the individual has already been invited to take part and whether they have been located.

If an individual has already been sent their invitation, and the interviewer has subsequently located them, then the NHS address will not be used.

If an individual has already been sent their invitation, but the interviewer has not been able to locate them – then a new invitation letter will be sent to the new NHS address and the interviewer will then attempt to contact the individual at that address.

If an individual is allocated to a wave that has not yet commenced then the initial invitation mailing will be sent to the new NHS address – and the interviewer will then attempt to contact the individual at that address.
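
The three cases above amount to a small decision table. The sketch below is purely illustrative; the function and enum names are hypothetical and do not describe any NatCen or CLS system.

```python
from enum import Enum


class AddressAction(Enum):
    """Hypothetical labels for what happens to a new NHS address."""
    NOT_USED = "NHS address not used"
    REISSUE_INVITATION = "new invitation sent to NHS address, then interviewer contact"
    INITIAL_INVITATION = "initial invitation sent to NHS address, then interviewer contact"


def action_for_nhs_address(invitation_sent: bool, located: bool) -> AddressAction:
    """Map a study member's fieldwork status to the use of the new NHS address."""
    if not invitation_sent:
        # Wave not yet commenced: the first mailing goes to the NHS address.
        return AddressAction.INITIAL_INVITATION
    if located:
        # Already invited and found by the interviewer: nothing to do.
        return AddressAction.NOT_USED
    # Already invited but not located: try again at the NHS address.
    return AddressAction.REISSUE_INVITATION
```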

If a letter that is sent to the new NHS address is returned to sender, or the interviewer is unable to make contact with the participant at that address, then interviewers will try to locate the participant elsewhere, and failing this the individual will be classified as ‘untraced’ and marked as such on the CLS database. If the interviewer makes contact with a participant – but the participant makes clear that they no longer wish to take part in the study, the case will be marked as a permanent refusal on the CLS database and will not be invited to take part in any future studies.
If a new address is obtained for a participant who was not included in the original sample file provided to NatCen it will not now be possible to invite that participant to take part in the Age 51 Survey. The address would still be uploaded to the CLS database. CLS send an annual mailing to all study members each April in their birthday week. The new addresses for cases not already provided to NatCen will be used to send this birthday mailing.


At the end of each interview, names, addresses and other contact details will be confirmed or updated on NatCen systems prior to being returned to CLS.

All personal information provided to or collected by the fieldwork agencies will then be destroyed on completion of their contracts. Kantar Public do not hold any data on their own systems. Access to NHS data by Kantar Public is strictly through a NatCen Virtual Machine Network used to look up participant data.


Millennium Cohort Study (also known as Child of the New Century) - Tracing — DARS-NIC-408892-F1R1Y

Type of data: information not disclosed for TRE projects

Opt outs honoured: Identifiable, Yes (Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 – s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'., Health and Social Care Act 2012 - s261(5)(d); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2021-07-06 — 2024-07-05 2021.12 — 2023.10. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Demographics

Objectives:

The Centre for Longitudinal Studies (CLS) at University College London (UCL) is an academic resource centre responsible for producing and disseminating data resources for the scientific community. It is responsible for four of Britain's internationally renowned longitudinal cohort studies: the 1958 National Child Development Study, the 1970 British Cohort Study, Next Steps, and the Millennium Cohort Study (MCS). All these studies are following the groups of participants from cradle to grave. As such, this group of studies is unique and has provided, and is still providing, a wealth of information used in the policy decisions affecting society's health and well-being.

The purpose of this application covers: Update CLS' database with new addresses and deaths - CLS is requesting up to date addresses for the MCS cohort members. The new addresses will be used to invite participants to take part in the upcoming survey. Date of Death is requested to ensure that ‘untraced’ cohort members are not contacted if they have died (although deaths are provided under agreement DARS-NIC-147860-0RSHN-v1.3 they would be disseminated at a different time to the new addresses and deaths under this application.)

The Millennium Cohort Study (MCS), also known as the ‘Child of the New Century’ to cohort members and their families, is following the lives of around 19000 young people born across England, Scotland, Wales and Northern Ireland in 2000-02.

MCS is renowned worldwide for the evidence it provides on children’s experience of growing up in the United Kingdom in the 21st Century. Since the study’s launch there have been seven attempts to re-contact and gather information from the whole cohort (at ages 9 months, 3 years, 5 years, 7 years, 11 years, 14 years and 17 years). The MCS covers such diverse topics as parenting; childcare; schooling and education (e.g. academic qualifications, vocational qualifications); daily activities and behaviour; cognitive development; child and parent mental and physical health; employment and education; income and poverty; housing, neighbourhood, and residential mobility; and social capital, ethnicity and identity. The information collected in previous sweeps of the study has formed the high-quality data resource that is MCS, for scientific investigation across the life course and domains. The seventh, Age 17 survey (2018-19) added to the data already collected in previous sweeps by updating information on current circumstances of the cohort and experiences they have had since the last sweep. In previous sweeps, schooling will have been the main activity common to the vast majority of cohort members.

The Age 17 survey marked an important transitional time in the cohort members’ lives, where educational and occupational paths can diverge significantly. It is also an important age in data collection terms since it may be the last sweep at which parents are interviewed and it is an age when direct engagement with the cohort members themselves rather than their families is crucial to the long-term viability of the study. To reflect this, CLS conducted face to face interviews with the cohort members for the first time. Cohort members were also asked to do a range of other activities including filling in a self-completion questionnaire on the interviewer’s tablet, completing a cognitive assessment (number activity) and having their height, weight and body fat measurements taken.

This was a unique opportunity to measure factors that underlie different types of transition into adult life, which may affect future wellbeing in unprecedented ways. Capturing these transitions well, alongside the contemporary factors underlying them was critical. It was important to build up a picture of daily life, including factors such as: relationship with parents, family and peers, risky behaviours, social media engagement and efforts on activities such as education /school. Additional factors affecting decisions at this age include attitudes and preferences, such as preferences for education, attitudes to risk, willingness to trade off resources at different points in time, and expectations about future life events. Measuring social and emotional development, mental health and cognitive development and using well-validated instruments, was also a critical component of the survey.

Of the approximately 19,000 individuals that have ever participated in the study there will always be a number of individuals for whom the Centre for Longitudinal Studies (CLS) at University College London will not have confirmed addresses at the time of carrying out the next survey.

The ongoing success of the study depends on maintaining contact with as large a number of study members as possible. Therefore, CLS are seeking permission to be supplied with updated addresses for MCS cohort members. All of these individuals have made an informed decision to participate in the study over the years and have been made aware that the study is seeking to follow them throughout their lives. This information is provided to participants on the study website, on the page CNC | FAQs (childnc.net), under ‘How we find you’. CLS provide a link to the information on the study website in all materials provided to cohort members. Cohort members receive an advance booklet with complete information about each upcoming survey.

Each year CLS sends an annual postal mailing to all MCS participants. CLS asks that participants complete a reply slip which is returned to CLS and which allows participants to provide CLS with any change in their details e.g., a new email address, phone number, etc. CLS also ask them to return the reply slip even if none of their details have changed i.e., seeking a positive confirmation that the address CLS hold for them is correct. As a result, CLS can maintain the cohort's latest details on the MCS database. In the event of the annual mailing not reaching the participant it is returned to CLS as a 'return to sender'. CLS will attempt to trace all these returns but if CLS cannot locate the participants then they are flagged on the database as a 'gone-away'. NHS Digital may potentially hold a more recent address and provide CLS with an opportunity to invite the cohort to re-join the study. CLS will send the details of approximately 14,100 cohort members to be linked to the NHS Digital Personal Demographics Service (PDS) dataset. This number excludes those who have died and those who have requested to be withdrawn from the study. NHS Digital will supply new addresses for study members who can be matched to the PDS dataset. Any study members for whom CLS successfully received a new address via this route would be written to and asked to provide updated contact details.

CLS will appoint an external supplier agency (data processor) to carry out the next survey interviews for age 22 which is currently planned to take place in 2023. CLS will also use a Mailing house supplier to send correspondence to participants inviting them to re-engage with the study. CLS intend to share new addresses received from NHS Digital with these organisations in order for them to invite study members to take part in future surveys. Once the agency is appointed, CLS will inform NHS Digital of the new data processors.

Any study member choosing not to take part in the study is flagged on the CLS database with a code denoting whether their refusal is temporary (i.e. for a particular wave/survey) or permanent (i.e. they wish to have no further involvement in the study). Any previously deposited anonymised survey data for a study member and confidential data from the address database are retained unless the study member specifically asks us not to retain them, in which case the data are securely deleted.

With regard to a request for 'withdrawal' from a participant CLS classifies them as a 'withdrawal from the current survey' or a 'withdrawal from the study' and these are handled slightly differently:
(a) Withdrawal from the current survey: CLS will flag this on its computer system to indicate that the participant will not be taking part in the current survey and the reason for not wanting to take part is also recorded. For example, they may just not have the time to take part. Therefore, there will be no further contact with the participant for the duration of the current survey but they will be invited to take part in the next survey.
(b) Withdrawal from the study: CLS will flag this on its computer system as a permanent refusal to indicate that the participant will not be taking any further part in the study itself and the reason for this type of withdrawal is also recorded for analysis purposes. Therefore there will be no further contact with the participant for the remainder of the longitudinal study. If this request is received in writing then CLS will acknowledge the request and notify the participant that they have been flagged and will no longer be contacted or receive any further communications. This request may sometimes be accompanied by a request for the destruction of their data.

University College London (UCL) is the sole Data Controller for this agreement and will also process data.

UCL's legal basis for processing (acquiring, linking and sharing) personal data is for a public task under GDPR (article 6(1)(e)) i.e. processing is necessary for the performance of a task carried out in the public interest (as is made explicit to participants in the information leaflets provided). UCL also process special categories of personal data for research under GDPR (article 9(2)(j)) i.e. processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes. In addition, for ethical reasons and under the Common Law Duty of Confidentiality, UCL sought permission from the Confidentiality Advisory Group to access this data without consent. CLS has also received Research Ethics Committee (REC) approval for tracing participants via NHS Digital.

Participants are aware that the study will attempt to trace them and CLS are confident that many of those newly traced via the NHS will be happy to take part. CLS also uses its own methods for tracing cohort members, for example asking the named stable contacts (relative, neighbour or friend) for the participant’s new address.

The Economic and Social Research Council (ESRC) are the funder for this study.

Yielded Benefits:

Expected Benefits:

The study produces rich, longitudinal, policy-relevant data, currently unavailable elsewhere, for a large, representative sample of children/young adults. MCS data is widely used by policy makers to evaluate and develop policy and improve services for young people and also by academic researchers to chart and understand social change. The information provided by cohort members provides valuable evidence for the research and policy community about the cohort's transitions to education/work and into early adult life. To enhance the research resource for secondary users, a fully documented, pseudonymised dataset collected at age 17 was archived at the UK Data Service.

A specific benefit of the data dissemination under this agreement is being able to trace cohort members. Using this data ensures the study sample is maintained and the study remains representative of the studied population.

Below are examples of existing publications using the MCS data benefiting public health.

Drinking in pregnancy
In the age 3 survey, MCS cohort children completed activities to show which words they understood and spoke, and which colours, letters, numbers, shapes and objects they were familiar with. Parents were also asked about different aspects of children's behaviour, such as how well they got on with other children and how active they were. Research using MCS survey data has found that children whose mothers drank heavily while they were pregnant were more likely to have behaviour problems at age 3 than those whose mothers did not drink or drank lightly. On average they also did less well in the different activities, although lots of other factors are important too.

Smoking in pregnancy
Several studies based on MCS have looked at how smoking during pregnancy relates to children's development. One group of researchers found that babies with mothers who smoked at any point while they were pregnant weighed on average 146 grams less when they were born (around the weight of a smartphone) than babies with mums who did not smoke. Overall, the more cigarettes a mother smoked a day, the less her baby weighed at birth. Babies with mothers whose partners smoked around them while they were pregnant also weighed on average 36 grams less (about the weight of a chocolate bar) than those with mothers who were not exposed to smoke.

Breastfeeding and child health
An influential study found that babies who were breastfed in the first months of their lives were less likely to go to hospital for diarrhoea or respiratory problems, such as infections and pneumonia. The researchers estimated that half of hospital stays for diarrhoea, and a quarter of stays for respiratory problems, could be prevented every month if all babies in the UK were fed entirely on breast milk for at least six months.

Breastfeeding and child development
Between ages 3 and 7 MCS children took part in a range of activities to show which words they knew and the patterns they could identify in shapes and images. Studies have found that children who were breastfed tended to do better in these exercises and to have fewer behaviour problems. Research has also suggested that there is a relationship between breastfeeding and young children's ability to coordinate the movements of their arms and legs and to reach milestones such as standing up for the first time and taking their first steps.

The paragraph below is taken from this 2011 impact case study - https://cls.ucl.ac.uk/wp-content/uploads/2017/06/Impact-case-studies-Millennium-Cohort-Study-November-2011.pdf

Breastfeeding and birth weight: The MCS research that found breastfeeding to be associated with lower hospitalisation rates for respiratory infections and child diarrhoea has proved to be very influential. It has been widely cited by health organisations, most notably in:

• the National Institute for Health and Clinical Excellence (NICE) guidance on Maternal and Child Nutrition;
• guidance issued by the Department of Health/Department for Children, Schools and Families, ‘Commissioning local breastfeeding support services’;
• ‘Infant Feeding Survey 2005: A commentary on infant feeding practices in the UK’, by the Scientific Advisory Committee on Nutrition.

The finding is highlighted in the nutrition guidelines and breastfeeding strategy documents published by many UK primary care trusts, including North Somerset, Stoke on Trent, Blaenau, Gwent, North Lincolnshire, Knowsley and Kent and Medway. It is also cited in documents published by the NCT (formerly the National Childbirth Trust), such as NCT breastfeeding support services - the evidence (2010). This finding has, additionally, had an impact far beyond the UK. It has been used to help underpin the South African government’s policy on breastfeeding (see ‘SA Breastfeeding Program: Strategic and action plan 2007 – 2012’). It is also referred to in several documents and public statements issued by Unicef UK on behalf of the Baby Friendly Initiative, a worldwide programme of the World Health Organization and Unicef.

Thanks to MCS and this research, mothers have more information and guidance about the health benefits of breastfeeding for their children.

Outputs:

Output from the data received from NHS Digital
The addresses previously obtained from NHS Digital, for other studies, were used to invite study members to take part and re-engage with the studies and will be used for this study to invite participants to take part in the upcoming survey. Using addresses provided by NHS Digital helped CLS get in touch with those cohort members who would otherwise not be able to take part in a new survey.

The main outcome for the study is the next sweep, MCS age 22, provisionally planned to take place in 2023, which will be a fully documented, anonymised research dataset and this will be archived with the UK Data Service to provide a strategically important resource for UK Social Science, including researchers in health and social care.

The scientific priorities and questionnaire content of the next sweep (age 22) will be elaborated and developed in consultation with the academic and policy community with the aim of collecting both information relevant to their lives at age 22 and to later life outcomes, as well as repeat measures of topics covered at age 17. CLS will continue to prospectively harmonise the content with other comparable cohorts, particularly those in the UK, by drawing on comparable measures at a similar age. All surveys are overseen by the CLS Strategic Advisory Board (SAB) which contains representatives from UKRI, Wellcome Trust, Medical Research Council, the scientific community, and government departments. The SAB provide high level strategic oversight for CLS to ensure the cohort studies led by the centre are developed, managed, and maintained in a manner that maximises their benefit as long-term scientific resources of importance both nationally and internationally, while protecting participants' interests. The SAB ensure that the content is closely aligned with research priorities, as well as the areas of research interest (ARIs) published by government departments. CLS will reflect these priorities when deciding on the major themes which CLS intend to cover at the next sweep collection at age 22 which will certainly include health themes such as Mental health and wellbeing, including psychological distress and anxiety, mental wellbeing, life satisfaction, loneliness, coping mechanisms. Physical health and health behaviours, including weight, substance use, sleep, diet and exercise will also be included.

Research using this data often features in the news, potentially reaching policy making communities in this way. For example: https://www.theguardian.com/society/2020/sep/17/children-living-in-more-costly-homes-have-fewer-mental-health-problems-study. Initial findings from the survey are shared on the study website: https://childnc.net/initial-findings-from-the-age-17-survey/. This is also shared with participants. Similar outputs are expected for the current project. This will encourage engagement with the public, the scientific and the policy-making communities.

CLS will continue to produce outputs from the study via the UK Data Service in the form of aggregated reports for the benefit of the wider research community, as MCS data has proven to be of interest across a wide range of research areas. CLS will also publish papers in a range of journals; however, it is not possible to provide detail at this point as to precisely which journals and dates, but the intention is to produce outputs along the same lines as those produced after previous sweeps. All scientific papers using the MCS data are published on the CLS Bibliography page online.
https://www.bibliography.cls.ucl.ac.uk/Bibliography.aspx?sitesectionid=647&sitesectiontitle=Bibliography

Processing:

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by "Personnel" (as defined within the Data Sharing Framework Contract i.e.: employees, agents and contractors of the Data Recipient who may have access to that data).

NHS address tracing and matching variables:
CLS wish to use the demographics products to receive new, updated addresses regularly.

CLS will supply NHS Digital with a file with study members to match to the NHS data. The file supplied will only contain eligible study members who have participated in at least one wave of MCS. It will not include study members known to have died or to have withdrawn from the study. The file will contain the following data items:
- CLS identifier
- First name
- Last name
- Middle name (where available)
- Date of birth
- Gender
- Postcode
- NHS Number
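
As an illustration of the eligibility rule and field list above, the match file could be assembled along the following lines. This is a sketch under stated assumptions: the dictionary keys (`cls_id`, `waves_participated`, `deceased`, `withdrawn` and so on) are hypothetical names, not fields of the actual CLS database.

```python
from typing import Dict, Iterable, List

# Data items listed in the agreement, under hypothetical column names.
MATCH_FIELDS = [
    "cls_id",
    "first_name",
    "last_name",
    "middle_name",  # included where available, otherwise left blank
    "date_of_birth",
    "gender",
    "postcode",
    "nhs_number",
]


def build_match_file(members: Iterable[Dict]) -> List[Dict]:
    """Keep eligible study members and only the fields listed for linkage."""
    rows = []
    for member in members:
        # Must have participated in at least one wave of the study.
        if member.get("waves_participated", 0) < 1:
            continue
        # Exclude those known to have died or to have withdrawn.
        if member.get("deceased") or member.get("withdrawn"):
            continue
        rows.append({name: member.get(name, "") for name in MATCH_FIELDS})
    return rows
```

Restricting each row to `MATCH_FIELDS` means that no other study data can leave the database in the match file, even if the member record carries additional columns.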

NHS Digital would supply the following details to CLS:
- CLS identifier
- NHS number
- Data as stated in the Demographics extract.

No other data will be linked to the NHS Digital data received.

All those accessing the data supplied by NHS Digital are substantive employees of University College London or employees of subcontractor organisations (organisation to be appointed) carrying out work on behalf of UCL who have been appropriately trained in data protection and confidentiality.

At UCL, the NHS Digital data will be held on the secure server in the UCL Data Safe Haven (DSH) and accessed remotely by CLS staff. The UCL DSH is certified to ISO 27001:2013 and is compliant with NHS Digital’s Data Security and Protection Toolkit. Staff using the DSH complete annual training and regularly review data access arrangements, ensuring that access to data is limited to those authorised. UCL Computing Regulations are based on the premise that access to resources is generally forbidden unless expressly permitted. All data transfers from the DSH require approval and are carried out through secure portals which are fully audited. Access to the UCL DSH is via remote desktop and requires multi-factor authentication. In addition to a strong password each user has to use a six-digit number generated by a smartphone app or physical token at each login. Passwords must be changed at regular intervals, and unused accounts are automatically disabled after a fixed period. Once inside the environment, robust access control ensures that researchers can only examine information that they are approved to use.

The data file supplied by NHS Digital will be reviewed by CLS. Where addresses supplied by NHS Digital are new or more recent than the address currently held on the CLS confidential database the new addresses will be uploaded. CLS will write to all newly traced cohort members at the addresses that are supplied by NHS Digital and will ask them to confirm their address by return of a reply slip, telephone, email or via our website. For this purpose, CLS will send names and addresses to the mailing house company (to be appointed) so they can send correspondence to cohort members on behalf of CLS. If cohort members confirm their address this will be recorded on CLS' database as a confirmed address. If the letter is ‘returned to sender’ this will also be recorded on CLS’ database. There will also be cases where no confirmation is received and CLS’ letter is not returned to sender.

CLS will use a mailing house organisation (to be appointed) to send correspondence to participants inviting them to re-engage with the study. Furthermore, addresses will be used by CLS to invite study members to take part in the current survey and future surveys. These new/more recent addresses will also be shared with the fieldwork agency (to be appointed) for the purpose of inviting participants to take part in the upcoming survey. Once the mailing organisation is appointed, CLS will submit an amendment to add the new processor to this application. Only after receiving NHS Digital approval of the new processor will CLS share any data with them.


Linkage of NHS Digital data to young people with perinatal HIV, to monitor cancers and deaths. — DARS-NIC-368477-C9Q1X

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant, No, Yes, Identifiable (Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 - s261 - 'Other dissemination of information'; Health and Social Care Act 2012 - s261(5)(d), Health and Social Care Act 2012 – s261(2)(c)

Purposes: No (Academic)

Sensitive: Non-Sensitive, and Sensitive

When:DSA runs 2022-02-14 — 2025-02-13 2022.04 — 2023.09. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Cancer Registration Data
  2. Civil Registration - Deaths
  3. Demographics
  4. Civil Registrations of Death

Expected Benefits:

This study hopes to provide evidence on health outcomes in early adulthood and to provide the foundation for long-term monitoring. The linkage to NHS Digital cancer and mortality data will be critical to estimate the risk of disease progression, hospitalisation and mortality and, it is hoped, will help to tailor future HIV care accordingly, either to help diagnose cancers earlier, or to prevent cancers and deaths from occurring. Improved data on critical health risks such as cancers and deaths are of public interest to government agencies such as the NHS and UK Health Security Agency (UKHSA), to quantify the burden of PHIV-related ill health and to tailor prevention and treatment.

The outputs of this study aim to either give reassurance that there is no additional risk of cancer or mortality among young people with PHIV, or to indicate what the increased risk is, and in which if any subgroups. Should increased risk be detected, results (through conference presentation and also review by guideline committees) are likely to change clinical guidelines both in the UK and Europe, and internationally, on the management of young people with PHIV. The aim would be for increased healthcare resources to be available for this group in order to maximise health benefits in the future. It is hoped that this will be achieved through sharing publications with the UK Health Security Agency and NHS England.

The MRC CTU at UCL aims to conduct analyses of these data to inform this overall aim. The study team aim to analyse incidence of cancers and deaths by age in young people living with PHIV, controlling for potential confounding factors.

Outputs:

The results of this study are expected to be published in high-impact peer reviewed journals such as Clinical Infectious Diseases or AIDS Care to give the highest impact and broadest readership. The aim is to publish papers in line with UCL's open access policy. This would include publication in open access journals, and summaries of results may be made available on the MRC CTU at UCL website, which is freely open to the public.

Outputs will be anonymised to the level required by the Information Standards Board for Health and Social Care (ISB) anonymisation standard and will contain aggregate and suppressed data (according to the HES analysis guide) only.

In addition, it is hoped that the results will be presented at scientific conferences and professional meetings related to HIV. The conferences will be chosen depending on the key findings and also the target audience. Possible conferences include the Children's HIV Association (CHIVA) and the British HIV Association (BHIVA) conferences. This would allow for coverage of key stakeholders in the setting of registry data and clinicians involved in HIV research, as well as young people themselves and their families.

The study team have already produced a leaflet about CHIPS and CHIPS+, targeted at young people themselves, to help the recruitment of young people into the study, and plan to produce another leaflet about the findings of CHIPS+. This approach worked well in the Adolescents and Adults Living with HIV Cohort Study (AALPHI), where the study team engaged young people in a project on the dissemination of the findings and they developed a leaflet and film with a graphic designer and film maker. It is hoped that the leaflet will be sent out to participating clinics, and put on the Children’s HIV Association (CHIVA) website. CHIVA is a registered charity based in Bristol which, among other things, aims to enhance the health and social outcomes of children, young people and families living with HIV (https://www.chiva.org.uk/about/).

The Youth Trials Board, which is part of CHIVA and made up of nine young people who have some training on clinical trials, is going to help develop the study dissemination strategy targeting young people. They have suggested using social media to disseminate the results, using their Twitter chat with @freedom2speak; their Instagram content, which is being developed in 2022, could also be used as a platform to share results with young people. They could also give presentations at CHIVA events, including CHIVA camp, which happens every summer.

Processing:

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract).

The processing activities are as follows:

1. The CHIPS+ Study team will identify the study participants for linkage to NHS Digital data, and inform the UCL MRC CTU Head of Data Management Systems (DMS). The team will provide the Study ID, Date of Birth, initials and sex to the Head of DMS.

2. The UCL MRC CTU Head of DMS has a secure database of Patient Identifiable Data (PID) in UCL’s Data Safe Haven; and will merge the PID with a list of Study IDs.

3. The resultant study cohort of participants (approx. 750 records) will be sent by the UCL MRC CTU Head of DMS to NHS Digital via Secure Electronic File Transfer Service (SEFT) with the following identifiers for linkage to the requested NHS Digital data products (Demographics data extract, Cancer Registrations data extract, and Civil Registration (Deaths) data extracts):
• Study ID (CHIPS+ participant identifier)
• NHS Number
• Date of birth
• Date of withdrawal (if applicable)
Additionally, those participants who have been recruited under a consultee will be flagged and will have National Data Opt Out applied to them.

Patients have consented for data to be shared with researchers in an anonymised form, which will be aggregated with small numbers suppressed as per the HES analysis guidance. They have also agreed that personal details can be used to obtain long term follow up information from national registries.

4. NHS Digital will use the cohort identifiers to extract linked data from the requested data products, including the full date and cause of death, cumulative from cohort member's date of first recruitment (unless subsequently withdrawn from the study).

5. The pseudonymised record-level NHS Digital datasets with the Study ID will be sent to the MRC CTU at UCL using SEFT. There will be three drops of pseudonymised record-level NHS Digital data during the period of this agreement.

6. NHS Digital records will be uploaded to UCL’s Data Safe Haven. An output file for trial statisticians, with the study-specific trial number, will be prepared and checked to ensure there is no PID within the file.

7. This output file is placed in a secure directory with limited access only to certain members of CHIPS+ study team.

8. Trial statisticians undertake data cleaning/validation activities for the processing of the data for the CHIPS+ study.
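
The flow in steps 1 to 8 can be sketched in a few lines. This is only an illustration under stated assumptions: the class, field and function names are hypothetical, the records are synthetic, and the exclusion of withdrawn participants is a simplification of the agreement's wording. It shows identifiers being used for matching only, with the returned records keyed by Study ID, followed by the step 6 check that no PID remains in the output file.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical names for the identifying fields that must not reach statisticians.
PID_FIELDS = {"nhs_number", "date_of_birth"}


@dataclass(frozen=True)
class CohortMember:
    study_id: str                      # study participant identifier
    nhs_number: str                    # used for matching only
    date_of_birth: str                 # used for matching only
    date_of_withdrawal: Optional[str] = None


def link_and_pseudonymise(cohort, registry):
    """Match on NHS number, return event records keyed by Study ID only.

    `registry` is a toy stand-in for the national data: a dict mapping
    NHS number to a list of event dicts.
    """
    linked = []
    for member in cohort:
        if member.date_of_withdrawal is not None:
            continue  # assumption: withdrawn participants are left out
        for event in registry.get(member.nhs_number, []):
            linked.append({"study_id": member.study_id, **event})
    return linked


def contains_pid(records):
    """Step 6 style check: does any record still carry an identifying field?"""
    return any(record.keys() & PID_FIELDS for record in records)


# Synthetic example data, not real identifiers.
cohort = [
    CohortMember("S001", "0000000001", "2000-01-01"),
    CohortMember("S002", "0000000002", "2001-02-02", date_of_withdrawal="2022-05-01"),
]
registry = {
    "0000000001": [{"event": "cancer_registration", "year": 2021}],
    "0000000002": [{"event": "death", "year": 2022}],
}
output_file = link_and_pseudonymise(cohort, registry)
```

Keeping the identifier table and the output file as separate objects mirrors the separation between the PID database and the statisticians' directory described above.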

The requested pseudonymised record-level data from NHS Digital will be used to ascertain the incidence of cancers and mortality in young people with PHIV from the CHIPS+ cohort. The data from NHS Digital will allow the MRC CTU at UCL to ensure that deaths are captured and included promptly by all centres and clinics involved in the trial, and to verify that all information is correct and recorded. The number of participants and the rationale for their inclusion will always be included in all presented results.

No linkage will be made to any other data set not already stated in this agreement.

The data will reside in UCL’s Data Safe Haven and will be identified by Study ID only, thus there will be no identifying personal data attached to a study number. Only defined members of the CHIPS+ study team will have access to Data Safe Haven for data analysis – all are substantive employees of UCL. All UCL substantive employees have completed training in data protection and confidentiality, and users of Data Safe Haven receive appropriate training before being granted access.

The data will be held on UCL’s Data Safe Haven. The Data Safe Haven is UCL’s technical solution for transferring and storing research information that is highly confidential. It meets the requirements of the NHS Digital DSP Toolkit and ISO 27001 Information Security standard. Access is controlled by the Information Asset Owner, and all UCL staff complete training in confidentiality and data protection, which is renewed annually.

Statistical data analysis will be carried out via UCL owned devices connected to the UCL network either directly in person or remotely, using an appropriate statistical package. Remote access to the server requires secure two-factor authentication (VPN), after which users are able to securely access the secure server on the University’s IT framework. All data analysis will be conducted within the confines of the University’s secure server, and will not be downloaded to remote devices for storage or processing.

The study team collects the participant’s NHS Number and Date of Birth at the time of consent/registration in the study only to enable linkage with NHS Digital. NHS Number and Date of Birth will be stored in the UCL Data Safe Haven until they are transferred to NHS Digital for linkage.

Records of all NHS Numbers will be immediately destroyed following the linkage process. Linkage with NHS Digital will enable the study team to capture data on incidence of cancers (including site and type) and deaths (date and cause).

Pseudonymised record-level data provided by NHS Digital will only be accessed and processed by substantive employees of UCL. The pseudonymised data will be stored separately to the Identifiable data on the UCL Data Safe Haven.

HES DISCLOSURE CONTROL / SMALL NUMBER SUPPRESSION
In order to protect patient confidentiality, when presenting results calculated from HES record level data, outputs will contain only aggregate level data with small numbers suppressed in line with the HES Analysis Guide. When publishing HES data, data processors must make sure that:
· National-level figures only may be presented unrounded, without small number suppression.
· Cell values from 1 to 7 (inclusive) are suppressed at a sub-national level to prevent possible identification of individuals from small counts within the table.
· Zeros (0) do not need to be suppressed.
· All other counts will be rounded to the nearest 5.
Data will not be made available to any third parties other than those specified except in the form of aggregated outputs with small numbers suppressed in line with the HES Analysis Guide.
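
As a rough illustration of the disclosure rules listed above, the following sketch applies them to a single table cell. The function name and the "*" suppression marker are assumptions made for this example; they are not taken from the HES Analysis Guide.

```python
SUPPRESSED = "*"  # hypothetical marker for a suppressed cell


def disclose_count(count, national=False):
    """Apply the small-number rules to one cell of an output table.

    - national-level figures are returned unrounded and unsuppressed
    - zeros are returned as-is
    - sub-national counts from 1 to 7 (inclusive) are suppressed
    - all other counts are rounded to the nearest 5
    """
    if count < 0:
        raise ValueError("counts cannot be negative")
    if national or count == 0:
        return count
    if 1 <= count <= 7:
        return SUPPRESSED
    # Integer counts never fall exactly halfway between multiples of 5,
    # so adding 2 before floor division rounds to the nearest multiple.
    return 5 * ((count + 2) // 5)


# A sub-national row of counts and the same row after disclosure control.
row = [0, 3, 7, 8, 12, 13, 101]
published = [disclose_count(c) for c in row]
```

For the row above, `published` contains the zero unchanged, two suppressed cells, and the remaining counts rounded to multiples of 5.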


Centre for Longitudinal Studies Birth Cohort Studies Data Linkage: 1970 British Cohort Study — DARS-NIC-49826-T0J7C

Type of data: information not disclosed for TRE projects

Opt outs honoured: N, Identifiable, Anonymised - ICO Code Compliant, No

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by the HSCIC, Informed Patient consent to permit the receipt, processing and release of data by NHS Digital, Health and Social Care Act 2012 – s261(2)(c)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Non-Sensitive, and Sensitive

When:DSA runs 2017-03-31 — 2020-04-01 2017.06 — 2023.09. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No, Yes

Datasets:

  1. Hospital Episode Statistics Accident and Emergency
  2. Hospital Episode Statistics Admitted Patient Care
  3. Hospital Episode Statistics Critical Care
  4. Hospital Episode Statistics Outpatients
  5. Emergency Care Data Set (ECDS)
  6. Hospital Episode Statistics Accident and Emergency (HES A and E)
  7. Hospital Episode Statistics Admitted Patient Care (HES APC)
  8. Hospital Episode Statistics Critical Care (HES Critical Care)
  9. Hospital Episode Statistics Outpatients (HES OP)

Objectives:

The Centre for Longitudinal Studies (CLS) is an Economic and Social Research Council (ESRC) Centre, based at the Department of Quantitative Social Science, UCL Institute of Education. It is responsible for three of Britain's internationally renowned birth cohort studies: the 1958 National Child Development Study, the 1970 British Cohort Study and the Millennium Cohort Study (MCS). All these studies are 'birth' studies, following the groups of participants from cradle to grave. As such, this group of studies is unique and has provided, and is still providing, a wealth of information used in the policy decisions affecting society's health and well-being.

The 1970 British Cohort Study (BCS70) has its origins in the late 1960s, when there was a great deal of concern amongst doctors and others about the number of babies born with abnormalities, or dying very early in life. They decided to compare those mothers and babies who had problems, with those who did not in order to see what could be done about this issue. The simplest way to do this was to study all the babies born in one week. With the help of doctors, midwives, and health authorities throughout England, Wales and Scotland, this study was carried out in 1970.

Information was collected on the family background of the mother, her pregnancy and labour, and about her baby at birth and in the first week of life. Almost 17,500 babies were studied.

It was not for another 5 years that it was decided that it would be worthwhile trying to find the families from the original birth survey to see what had happened to the babies since 1970 – how healthy they were, how they were getting on at school, and so on. This second survey was carried out in 1975. Since then there have been seven other major surveys, attempting to trace all those born in the week of the original 1970 survey – in 1980, 1986, 1996, 1999/2000, 2004/5, 2008 and in 2012 when study members were aged 42.

During the 2012 survey, CLS obtained informed consent from cohort members for their health data to be linked to the data collected in the study. In total consent was obtained from 6181 cohort members who at the time were in England.

Linking health data from Hospital Episode Statistics (HES) to the BCS70 survey data will greatly increase the possibilities for using the cohort to study how health outcomes impact on the individual and aspects of their life such as work, relationships and family life and, likewise, how health outcomes relate to individual behaviours and lifestyle choices such as drug and alcohol use, sexual health, diet and exercise, which are all documented as part of the study. The successful inclusion of HES data will enrich these data by revealing which cohort members have been admitted to or attended hospital and the reasons for this, e.g. drug and alcohol treatment, accident and emergency, maternity and mental health services, which could help improve understanding of how health conditions could be better treated or supported.

Data about health behaviours may be more accurate if obtained from administrative records, because survey responses can be affected by misreporting of complex health conditions, under-reporting of particular health problems, or perceived sensitivities around certain behaviours and lifestyle choices. So this also offers a valuable methodological opportunity to validate the data collected in the survey and vice versa.

At this stage the aim of the researchers is to:
1. Validate and improve the quality of the cohort data
2. Produce methodological papers describing the quality of the data and its benefit to health and social care
3. Develop and create a useful and rich HES linked (Age 42) BCS70 dataset

UCL will not link the identifiable data they hold with data disseminated from NHS Digital. The only exception would be where a participant wishes to withdraw from the study.

UCL will not share the linked HES/Age 42 BCS70 dataset with third parties.

Expected Benefits:

The BCS70 surveys include questions relating to health outcomes and hospitalisations. CLS will compare these responses with the data available in HES to obtain a better understanding of the relationship between self-reported and administrative data. This will be shared via methodological information which will assess the data quality and comparability of two important data sources. This will be of benefit to research looking at health and social care issues which in turn, through time and cost savings, will be of benefit to patients.

This data linkage will facilitate research that CLS anticipate will be carried out on the effects of familial socioeconomic circumstances, lifestyle and environmental factors on the evolution of the wellbeing, health and development of family members. This will be of direct benefit to the NHS and to community services such as those interfacing with schools through informing policy to improve healthy lifestyles.

It is difficult to predict in advance the type of research question that might be put forward. Below are four examples of existing publications using BCS70 data benefiting public health.

GREENE, G, GREGORY, A.M, FONE, D and WHITE, J. (2015) Childhood sleeping difficulties and depression in adulthood: the 1970 British Cohort Study. Journal of Sleep Research, 24(1), 19-23.

VINER, R.M. and TAYLOR, B. (2007) Adult outcomes of binge drinking in adolescence: findings from a UK national birth cohort. Journal of Epidemiology and Community Health, 61(10), 902-907.

CABLE, N, KELLY, Y, BARTLEY, M, SATO, Y and SACKER, A. (2014) Critical role of smoking and household dampness during childhood for adult phlegm and cough: a research example from a prospective cohort study in Great Britain. BMJ Open, 4(4), e004807.

SMITH, L, GARDNER, B, AGGIO, D and HAMER, M. (2015) Association between participation in outdoor play and sport at 10 years old with physical activity in adulthood. Preventive Medicine, 74(May 2015), 31–35.

The following expands on the benefits of these example publications using BCS70 data, drawing attention to how early life course experiences and exposures shape health outcomes into adulthood.

Greene, Gregory, Fone and White (2015), for example, investigated the relationship between childhood sleeping difficulties (at age 5) and depression in adulthood (age 34), to conclude that severe sleeping problems in childhood may be associated with increased susceptibility to depression in adult life. Adjusting for the potential confounding influences of maternal depression and sleeping difficulties, parental reports of severe sleeping difficulties at 5 years were associated with an increased risk of depression at age 34 years [odds ratio (OR) = 1.9, 95% confidence interval (CI) = 1.2, 3.2] whereas moderate sleeping difficulties were not (OR = 1.1, 95% CI = 0.9, 1.3). Further research, however, is needed to explore whether screening and the treatment of children for poor sleeping patterns might impact upon their mental health in adulthood.

Persistent sleep problems are an increasing health concern. In addition, poor sleep in adulthood has been linked with hypertension, diabetes, depression and obesity, as well as cancer and increased mortality (Colten and Altevogt, 2006). Therefore, successful identification and treatment for children with sleeping difficulties could, if the association identified by the authors is causal, have large dividends across many aspects of health in the future (GREENE, G, GREGORY, A.M, FONE, D and WHITE, J. (2015) Childhood sleeping difficulties and depression in adulthood: the 1970 British Cohort Study. Journal of Sleep Research, 24(1), 19-23.).

Viner and Taylor (2007) studied outcomes in adult life (at age 30) of binge drinking in adolescence (at age 16). Adolescent binge drinking predicted an increased risk of adult alcohol dependence (OR 1.6, 95% CI 1.3 to 2.0), excessive regular consumption (OR 1.7, 95% CI 1.4 to 2.1), illicit drug use (OR 1.4, 95% CI 1.1 to 1.8), psychiatric morbidity (OR 1.4, 95% CI 1.1 to 1.9), homelessness (OR 1.6, 95% CI 1.1 to 2.4), convictions (OR 1.9, 95% CI 1.4 to 2.5), school exclusion (OR 3.9, 95% CI 1.9 to 8.2), lack of qualifications (OR 1.3, 95% CI 1.1 to 1.6), accidents (OR 1.4, 95% CI 1.1 to 1.6) and lower adult social class, after adjustment for adolescent socioeconomic status and adolescent baseline status of the outcome under study.
The authors note that these associations appear to be distinct from those associated with habitual frequent alcohol use, and that binge drinking may contribute to the development of health and social inequalities during the transition from adolescence to adulthood (VINER, R.M. and TAYLOR, B. (2007) Adult outcomes of binge drinking in adolescence: findings from a UK national birth cohort. Journal of Epidemiology and Community Health, 61(10), 902-907.).

The findings of Cable, Kelly, Bartley, Sato, and Sacker (2014) from BCS70 data give support to current public health interventions for adult smoking and raise concerns about the long-term effects of a damp home environment on the respiratory health of children. The authors examined the associations between childhood exposures to smoking and household dampness (at age 10), and phlegm and cough in adulthood (29 years of age), and found that childhood smoking and exposure to marked household dampness at age 10 were associated with phlegm (childhood smoking: relative risk ratio (RRR) =1.45, 95% CI 1.02 to 2.05; dampness: RRR=2.05, 95% CI 1.07 to 3.91) and co-occurring cough and phlegm (childhood smoking: RRR=1.35, 95% CI 1.08 to 1.67; dampness: RRR=2.73, 95% CI 1.88 to 3.99), while exposure to two or more adult smokers in the household was associated with cough-related symptoms (cough only: RRR=1.28, 95% CI 1.04 to 1.58; phlegm and cough: RRR=1.32, 95% CI 1.06 to 1.64).

These associations were independent of adult smoking, childhood phlegm and cough, early social background and sex. Smoking at age 29 contributed to all symptom patterns; however, a substantial association between household dampness and co-occurring phlegm and cough suggests long-term detrimental effects of childhood environmental exposures.

The authors' findings support current public health interventions to reduce adult smoking, but also indicate that the management of childhood risk factors such as exposure to smoke (active or second-hand) and household dampness can be a way to prevent adults experiencing poor respiratory health (CABLE, N, KELLY, Y, BARTLEY, M, SATO, Y and SACKER, A. (2014) Critical role of smoking and household dampness during childhood for adult phlegm and cough: a research example from a prospective cohort study in Great Britain. BMJ Open, 4(4), e004807).

Smith, Gardner, Aggio and Hamer (2015) investigated whether active outdoor play and/or sports at age 10 is associated with sport/physical activity at age 42. Final adjusted Cox regression models showed that participants (n=6458) who often participated in sports at age 10 were significantly more likely to participate in sport/physical activity at age 42 (RR 1.10; 95% CI 1.01 to 1.19). Active outdoor play at age 10 was not associated with participation in sport/physical activity at age 42 (RR 0.99; 95% CI 0.91 to 1.07). The authors suggest that childhood activity interventions might best achieve lasting change by promoting engagement in sport rather than active outdoor play (Tammelin et al., 2003a, 2003b) (SMITH, L, GARDNER, B, AGGIO, D and HAMER, M. (2015) Association between participation in outdoor play and sport at 10 years old with physical activity in adulthood. Preventive Medicine, 74(May 2015), 31–35.)


To provide an example of the sorts of benefits to health that this linkage and use of this data may provide, it may be useful to be aware of the impact and benefit to health that the 1958 National Child Development Study (NCDS) cohort has made. This is a similar birth tracking cohort still following its members today. In its nearly sixty years, research from this cohort has been responsible for proving beyond doubt that mothers who smoked heavily during pregnancy harmed the health and reduced the weight and height of their children, continuing on to damage English and maths scores at 16 years old. The study also informed the debate about the best place to deliver babies, indicating that mothers should only opt for home births when very early transfer to hospital is possible at the first sign of need and where highly experienced midwives and doctors are available. The study repeatedly demonstrated the need for steps to promote the health of pregnant mothers and facilities for safe childbirth. This led to the modernisation of maternity services with ready availability of high quality obstetrics on the one hand and better and more personal care for all. The case was made for adequate numbers of hospital beds and abolition of the lottery of where to give birth. Research has also made use of the longitudinal nature of the NCDS to examine the long-term effects of breastfeeding. For example, Rudnicka et al (2007) demonstrate that, compared with those who were bottle-fed with formula milk, children who were breastfed for more than a month had a reduced waist circumference and waist/hip ratio, and lower odds of obesity as adults in their mid-forties. Research using this cohort has also shed light on cancer and leukaemia in childhood, behavioural disorder, educational delay and disability.

RUDNICKA, A. R, OWEN, C. G and STRACHAN, D. P. (2007) The effect of breast feeding on cardio-respiratory risk factors in adulthood. Pediatrics, 119(5), E1107-15.

BCS70 data is a rich and unique resource for the research and policy community, and the information collected on health and its social determinants widens its potential value for health research and policy interventions. Linking health data from Hospital Episode Statistics (HES) to the BCS70 survey data will greatly increase the potential of the data and future research in the area of health.

The validation of self reported and hospital reported outcomes will benefit health research in terms of being able to offer research methodologies that are quicker and more cost effective. This will be of benefit to patients who are the recipients of research such as in public health and medical interventions etc. For example, using health administration data such as HES could make the delivery of research more efficient and potentially more accurate. It may also increase the volume of research and ensure that research takes place into diseases which are currently difficult to fund.

Outputs:

Following the data quality and validation work, the first output will be the creation of the linked BCS70 (Age42)/HES dataset. The HES data will add an important layer to this already rich data as well as providing the means for data quality checking.

The second output will be methodological papers published in peer reviewed journals reviewing the linkage and validating the data from the two data sources. These methodological assessments are expected to finish two years after obtaining the data. Outputs will contain only aggregate level data with small numbers suppressed in line with the HES Analysis Guide.

The creation of this database and the methodological papers are the first steps in establishing a robust research database which will be of benefit to health and social care. No onward sharing to researchers will take place. Any onward sharing will be subject to a further application to NHS Digital.

The outputs in the long term from this dataset are difficult to quantify, but the CLS currently has a searchable bibliography on its website with over 3,600 publications based on data from the 1958, 1970 and Millennium cohort studies.

Processing:

Data disseminated from NHSD to UCL will only be accessed by substantive employees of UCL and only for the purposes described in this document.

HES data will not be relinked to the identifiable data, which is held separately from the survey response data. Re-identification will only happen when a cohort member requests withdrawal from the study, which includes removal of their data. Where a participant wishes to withdraw from the study, the identifiable data is used to locate the study ID, which is then used to destroy their data.

1. CLS team will supply NHS Digital with the following identifiers of cohort members who have consented to this data sharing: sex, postcode, date of birth, NHS number (if known) and unique ID (study-specific pseudonymised identifier).

2. NHS Digital will link the identifiable study data to HES data. NHS Digital will then remove identifiers from the linked dataset and return the dataset to the CLS team at UCL with the study ID.

3. CLS will carry out validation of the linked HES data and will combine the supplied HES data with the information collected from the participant as part of the BCS70 study.

Once the linked survey-HES data files have been created, CLS may perform other activities to prepare the data for use in research, such as coding and cleaning, derivation of summary variables and compilation of data documentation.

4. CLS researchers will use these data to create an analysis file that will not contain any identifiable data.

5. CLS will create derived variables that summarise study members’ hospitalisation and health histories (e.g. hospital admissions and re-admissions, incidence of common diseases, children’s ailments etc.), and will compare BCS70 survey data with data from hospital statistics, in order to compare and validate the data collected in CLS surveys.
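
One simple way to picture the comparison in step 5 is a two-by-two agreement table between a self-reported flag and a HES-derived flag for the same study IDs. The sketch below is an assumption about how such a check might look, not the CLS method: the function names, the variable names and the synthetic data are all hypothetical.

```python
def agreement_table(survey, hes):
    """Cross-tabulate two boolean flags held per study ID.

    `survey` and `hes` map study ID to True/False, for example
    "reported a hospital admission" versus "has an admission in HES".
    Only study IDs present in both sources are compared.
    """
    table = {"both": 0, "survey_only": 0, "hes_only": 0, "neither": 0}
    for study_id in survey.keys() & hes.keys():
        reported, recorded = survey[study_id], hes[study_id]
        if reported and recorded:
            table["both"] += 1
        elif reported:
            table["survey_only"] += 1
        elif recorded:
            table["hes_only"] += 1
        else:
            table["neither"] += 1
    return table


def percent_agreement(table):
    """Share of compared members whose two flags match, as a percentage."""
    total = sum(table.values())
    if total == 0:
        raise ValueError("no study IDs in common")
    return 100.0 * (table["both"] + table["neither"]) / total


# Synthetic flags keyed by pseudonymised study ID.
survey_flags = {"A": True, "B": False, "C": True, "D": False}
hes_flags = {"A": True, "B": True, "C": False, "D": False}
table = agreement_table(survey_flags, hes_flags)
```

Any counts published from such a table would still need small number suppression in line with the HES Analysis Guide.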


LAUNCHES QI: Linking AUdit and National datasets in Congenital HEart Services for Quality Improvement. — DARS-NIC-234297-P4M5G

Type of data: information not disclosed for TRE projects

Opt outs honoured: Yes - patient objections upheld, Anonymised - ICO Code Compliant, Yes (Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 - s261(5)(d), Health and Social Care Act 2012 – s261(2)(a)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2019-07-01 — 2022-06-30 2020.09 — 2023.08. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. HES:Civil Registration (Deaths) bridge
  2. Civil Registration - Deaths
  3. Hospital Episode Statistics Accident and Emergency
  4. Hospital Episode Statistics Admitted Patient Care
  5. Hospital Episode Statistics Outpatients
  6. Civil Registration (Deaths) - Secondary Care Cut
  7. HES-ID to MPS-ID HES Accident and Emergency
  8. HES-ID to MPS-ID HES Admitted Patient Care
  9. HES-ID to MPS-ID HES Outpatients
  10. Civil Registrations of Death - Secondary Care Cut
  11. Hospital Episode Statistics Accident and Emergency (HES A and E)
  12. Hospital Episode Statistics Admitted Patient Care (HES APC)
  13. Hospital Episode Statistics Outpatients (HES OP)

Objectives:

University College London (UCL) requires pseudonymised data of life status, age at life status, place of death and a subset of HES data for use in a new study called “LAUNCHES QI: Linking AUdit and National datasets in Congenital HEart Services for Quality Improvement” (IRAS ID 246796). UCL is the sole data controller and also processes the data.

LAUNCHES QI aims to indirectly improve services for congenital heart disease (CHD) by providing the first description of how CHD patients interact with the NHS acute sector and where variation in outcomes or service use exists. This information is the first crucial step in supporting service improvement by building the evidence base on which aspects of the current service offer the most potential for improvement programmes. The team will link for the first time NCHDA (National Congenital Heart Disease Audit), PICANet (paediatric intensive care audit), ICNARC CMP (adult intensive care audit), Life status and place of death, and HES (Hospital Episode Statistics) data. This will: a) provide information on the challenges in linking national data sets and whether it is feasible to do this routinely, and b) create a research dataset to examine the interactions CHD patients have with different NHS services over time. The team will aim to improve services by: describing patient care trajectories through secondary and tertiary care; identifying useful metrics for driving quality improvement (QI), informing commissioning and policy; and exploring variation across services to identify priorities for QI.

Measuring, reporting and learning from outcomes should drive quality improvement (QI), but this is particularly challenging for lifelong conditions such as CHD where outcomes need to be interpreted in the context of changing treatment options, service provision and the natural evolution of disease. Given the complex care trajectories of such patients, rich datasets and careful multi-disciplinary analysis are required to identify meaningful variations and opportunities for targeted QI. The study will produce: the first comprehensive understanding of care received by a complex population from birth to adulthood; a basis for creating a step change in how quality in CHD services is measured and improved.

The activity is compliant with the principles of GDPR. Based on guidance, as this is for University research, the lawful basis for processing data is GDPR article 6(1)(e): ‘Processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller’. Also referred to as ‘Public Task’. As the research involves health data, which is included in the definition of special categories of personal data, it requires an additional condition for processing. Based on guidance, for health research this is GDPR article 9(2)(j), which details that processing is necessary for scientific and research purposes, subject to appropriate safeguards.

LAUNCHES QI is a dataset analysis of five linked audit and national datasets which will include up to approximately 144,000 patients with congenital heart disease that have been captured by the National Congenital Heart Disease Audit (NCHDA) since 2000, the core dataset defining the study CHD population. Included will be all patients, with no age restrictions, who have had at least one intervention for congenital heart disease that has generated at least one entry in NCHDA. Patients undergoing cardiac intervention since April 2000 (when NCHDA began) will be included in the study, but the requested records for these patients in HES is from 1997/98 (where available) until March 2017, since these patients may have interacted with NHS services prior to their cardiac intervention. CHD is a life-long condition, so obtaining 20 years of data (including 3 years prior to the NCHDA dataset) is crucial for establishing as complete trajectories of their service use as possible, which will enable the study to get a more detailed understanding of how patients interact with secondary care throughout their lives, how this varies across centres, and how it impacts on outcomes. National data is required to allow exploration of the different CHD services across the country. The pseudonymised linkage of the other 4 datasets to NCHDA at patient level will allow trajectories of care to be determined.

The study research dataset will be generated by linking pseudonymised NCHDA data (English and Welsh centres) to the four additional sources:
1) Paediatric intensive care data from the Paediatric Intensive Care Audit Network (PICANet) based at University of Leeds (English centres).
2) Adult intensive care audit data (Case Mix Programme, CMP) (English and Welsh centres).
3) Hospital Episode Statistics (HES), (English centres).
4) Death registrations (Civil Registration Deaths). (Will link to identifiers from patients of English and Welsh centres).

UCL has received authorisation from HQIP to transfer personal identifiers from NICOR (which curates the NCHDA dataset) to ICNARC, PICANet and NHS Digital. NICOR is located at Bart’s Health NHS Trust. UCL has also received authorisation from HQIP and ICNARC to receive the pseudonymised clinical information at University College London. UCL has Ethics (18/NS/0106) and CAG (18CAG0180) approval for the study to process the data and to link the datasets.

Two datasets will be created: one containing the patient level, pseudonymised information of all hospital admissions, outpatient appointments and procedures for each CHD patient; the other a pseudonymised hospital admission level dataset covering all hospital stays for CHD patients. The final pseudonymised datasets will be stored within UCL’s secure data safe haven.

Patient identifiers will be sent to NHS Digital from the NCHDA database. UCL are requesting that NHS Digital match the identifiers from the NCHDA dataset to HES and Civil Registration Deaths and extract the requested fields. NCHDA will provide a record level LAUNCHES ID which should be transferred to each pseudonymised HES/Civil Registration Deaths record that matches. UCL additionally request that NHS Digital generate a unique record-level Study ID that will be common between NHS Digital and UCL to facilitate data queries. The data is to be securely transferred with the record level study numbers to University College London.

The LAUNCHES QI study was instigated and is led by the Principal Investigators based at the Clinical Operational Research Unit (CORU) at UCL. University College London will be the only organisation to require access to the record level data supplied from NHS Digital.

The sole funder, The Health Foundation, are involved in the study only to provide the award, as grant-funding. They will also oversee progress of the project through annual reports and award meetings. The Health Foundation have no influence over the results. They are not permitted to access any record level NHS Digital data.

As in many studies, a number of collaborators provide an advisory role. Only UCL substantive employees are working with the data. The organisations involved as collaborators/advisors other than UCL are Birmingham Children’s Hospital, Intensive Care National Audit & Research Centre, Leeds University, Royal Brompton NHS Foundation Trust, Great Ormond Street Hospital NHS Trust, Leeds Teaching Hospitals NHS Foundation Trust, University Hospital Southampton, and University Hospitals Bristol NHS Foundation Trust. These organisations will not have access to the data. UCL researchers make all the final decisions as data controllers.

Expected Benefits:

The LAUNCHES study addresses important areas related to services in congenital heart disease (CHD) about which there is currently little or no information. In response to national recognition of the urgent need, the study will provide information regarding better system measures for identifying opportunities for quality improvement, trajectories of service use, and variation in CHD service provision.

The main areas of uncertainty addressed by the LAUNCHES study are:
1. There is little known about long term survival and outcomes for CHD patients.
2. Little is known about patient trajectories through the NHS for lifelong conditions like CHD (i.e. for a given patient, their outcomes and sequence of contacts with the NHS over time). The study will generate important understanding about service use and variation, and use this to identify key areas to target improvements in health care quality.
3. Many specialised CHD services are commissioned directly by NHS England. To assure quality, NHS England rely primarily on a Quality Dashboard which necessarily uses fairly crude measures available in the current system. The study will directly address a stated need for better measures of service quality by NHS England by developing new robust service metrics.
4. Other lifelong conditions, particularly those where patients are likely to require periods of intensive care such as Cystic Fibrosis or Renal Disease, will also benefit from their approach for linking audit data to measure quality and variations.

Benefits will include:
1. Designing meaningful outcomes for commissioners, managers, patients and clinical teams that can be interpreted in the context of changing treatment options, service provision and the natural evolution of disease.
2. Connecting different data sources to track outcomes and service use across multiple sectors over a lifetime.
3. Identifying and understanding variations in outcomes to drive quality improvement.

This research will benefit the provision of health care by:

Identifying candidate metrics to inform commissioning and drive change for CHD: Many specialised CHD services are commissioned directly by NHS England. To assure quality, they rely primarily on a Quality Dashboard which necessarily uses fairly crude measures available in the current system. The study will directly address a stated need for better measures of service quality by NHS England. The importance of the research to commissioners is demonstrated through the CO-Is from the relevant Clinical Reference Group and the relevant national audits, and endorsement from NHS England, clinical bodies (BCCA, SCTS, RCSEd) & HQIP.

Generate important new knowledge about lifelong service use & variation for CHD, and target areas for QI: Little is known about patient trajectories through the NHS for lifelong conditions like CHD (i.e. for a given patient, their outcomes and sequence of contacts with the NHS over time). Leveraging the national audits, the study will establish patient trajectories for the CHD population for the first time, generating important understanding about service use & variation, and using this to identify key areas to target improvements in health care quality.

Provide a template for other high-resource, high-impact NHS services: The approach for linking audit data to measure quality and variations will be applicable to other lifelong conditions, particularly those where patients are likely to require periods of intensive care such as Cystic Fibrosis or Renal Disease. The team will share & build on the generalisable learning e.g.: technical & governance solutions for linking national datasets routinely; methods for incorporating commissioning & management perspectives in developing metrics for QI & Quality Assurance (QA); approaches to tracking patient trajectories through multiple datasets.

With approximately 200,000 people currently living with CHD in the UK, and with services that are expensive, high profile and have an enormous impact on the lives of patients and their families, dissemination is in the public interest. With the possibility of applying findings to other lifelong conditions, the magnitude of the impact is potentially even higher.

The researchers will generate the high-quality evidence necessary to guide the quality improvement of CHD services and inform decisions about national policies. Scientific manuscripts will be written detailing the findings. UCL expect the publications to contribute to the evidence and expert opinion for the development and update of clinical guidelines in congenital heart disease in the future. The dissemination via publications and presentations will be completed following conclusion of the study in 2021. The timescale of the full benefit through changes in policies and in procedures is impossible to determine.

Outputs:

The results of the study will be disseminated actively and extensively. The research team has strong links with the Congenital Heart Services Clinical Reference Group, NHS England and clinical bodies including the British Congenital Cardiac Association, the Society for Cardiothoracic Surgery and the Royal College of Surgeons of Edinburgh. UCL also have strong links with CHD charities including The Somerville Foundation, Children’s Heart Federation, The British Heart Foundation and Little Hearts Matter.

Outputs will involve approximately ten publications in peer-reviewed Medical and Scientific Journals, oral and written presentations at national and international conferences. Target journals for the papers are Circulation, Heart, The Annals of Thoracic Surgery, and Archives of Disease in Childhood. The final outputs will only contain aggregate results with small number suppression, in line with the HES Analysis Guidelines.

The LAUNCHES study team plan to write the ten peer reviewed publications to be completed within six months of the study end. The study end date is end of February 2021. Therefore the team are aiming that the publications will be completed by summer 2021 and will present findings to key stakeholders (e.g. professional societies, national audit bodies, the Care Quality Commission, HQIP, commissioners and local hospitals) through meetings and short briefing documents. UCL will disseminate to the public through a project website (https://www.ucl.ac.uk/operational-research/domains/congenital_heart_disease/launches), and via social media (Twitter @UCL_CORU) and blogs. Updates to the website will be ongoing throughout the study and as and when publications and communications are available. The final communication will coincide with the final publication.

UCL will ensure that lay summaries are provided (reviewed in collaboration with patients and parents on their Advisory Committee). The patients and parents on the advisory committee attend annual advisory group meetings to receive updates and to provide feedback on any aspect of the study.

The team have also arranged the dissemination of findings through the Children’s Heart Federation. Where appropriate, results will be promoted as press releases (2019-2021). UCL will also submit reports to the Health Foundation (the project funders) and partner with them to draw on their networks and skills in dissemination and spread to make an impact more widely, which may include generating accessible resources such as downloadable leaflets and case studies, research highlights, blogs and webinars. UCL will also publish details of LAUNCHES QI on the PICANet, ICNARC and NICOR websites as well as on the UCL CORU website. Dissemination will occur throughout the course of the LAUNCHES study (2019-2021).

Processing:

LAUNCHES QI has the necessary research ethics and section 251 approvals. A favourable opinion has been obtained from the North of Scotland Research Ethics Committee, reference number 18/NS/0106. Section 251 support has been received to ensure that the accessing, linking and processing of the datasets is in line with the common law duty of confidence (Ref: 18CAG0180). Data will not be handled by any additional third party organisations. Data will not be accessed outside the UK.

The planned data flows are as follows:
1) Data flows to NHS Digital
National Institute for Cardiovascular Outcomes Research (NICOR) will securely transfer a file to NHS Digital. This file will contain patient identifiable information (NHS Number, name, postcode, date of birth, local hospital patient ID) for all patients in the National Congenital Heart Disease Audit (NCHDA), and unique study ID (NCHDA record-level LAUNCHES Study ID).
2) NHS Digital will identify common records between NCHDA data and HES and Civil Registration Deaths data, including the requested derived fields. NHS Digital will generate a unique record-level Study ID for each identified record.
3) Data flows from NHS Digital
i) NHS Digital will return HES data for all individuals in the NCHDA cohort to University College London. The unique study ID (LAUNCHES QI record level study number and HES record-level study number) and requested derived fields will be appended to the end of every episode record returned to University College London.
ii) Civil Registration Deaths derived fields for all individuals in the NCHDA cohort will be returned to University College London. The unique study ID (LAUNCHES QI record level study number and HES record-level study number) will be appended to the end of every record.
4) The linkage strategy is that NICOR will provide the personal identifiers NHS number, hospital number, date of birth and postcode to NHS Digital, ICNARC and PICANet from the NCHDA data. Patient name will also be sent to PICANet and NHS Digital. Each will identify in their respective datasets which records pertain to those CHD individuals and return to UCL the requested clinical and administrative data they hold on the matched individuals (UCL will not receive data for patients that do not match), pseudonymised with the LAUNCHES record level ID and local dataset record level IDs. UCL will receive from NICOR the pseudonymised clinical data of the NCHDA dataset and will not receive any personal identifiers.
UCL will then link HES and Civil Registration Deaths data received from NHS Digital to pseudonymised clinical data received from NICOR (NCHDA data), PICANet and ICNARC (CMP data), via the LAUNCHES QI unique record-level study number and the LAUNCHES Patient IDs (which will be contained within the NCHDA data set). UCL will not be receiving any personal identifiers from any of the data sources. UCL will use the record level NHS Digital study identifiers in case of any queries about specific records with NHS Digital.
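
The linkage step described above can be sketched in a few lines of Python. This is a minimal illustration only, not the study's actual pipeline: the field names (`launches_id`, `hes_record_id`, `age_at_admission`) and the sample records are hypothetical, and the real processing takes place inside the UCL Data Safe Haven. The sketch shows only the principle that records are joined on a pseudonymous study ID, with no personal identifiers involved.

```python
from collections import defaultdict


def link_by_study_id(nchda_rows, hes_rows, key="launches_id"):
    """Attach HES episodes to pseudonymised NCHDA records via a shared study ID.

    `key` is a hypothetical field name for the LAUNCHES record-level ID.
    """
    # Group HES episodes by the pseudonymous study ID.
    episodes = defaultdict(list)
    for row in hes_rows:
        episodes[row[key]].append(row)

    # Every NCHDA record is kept; unmatched records get an empty episode list.
    linked = []
    for patient in nchda_rows:
        linked.append({**patient, "hes_episodes": episodes.get(patient[key], [])})
    return linked


# Hypothetical sample data for illustration.
nchda = [
    {"launches_id": "L001", "procedure": "A"},
    {"launches_id": "L002", "procedure": "B"},
]
hes = [
    {"launches_id": "L001", "hes_record_id": "H9", "age_at_admission": 0.2519},
    {"launches_id": "L001", "hes_record_id": "H10", "age_at_admission": 1.5021},
]
linked = link_by_study_id(nchda, hes)
```

The second, record-level identifier (`hes_record_id` here) plays no part in the join; in the agreement its stated role is to support queries about specific records with NHS Digital.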

The data is stored and processed within the UCL Identifiable Data Handling Solution (IDHS) called the Data Safe Haven (DSH). The data will be held within a secure environment where all statistical analyses will be undertaken. Access to this record level data will be limited to only four members of the LAUNCHES QI team, who are all substantive employees at UCL. Staff accessing the UCL data safe haven attend training in its use and security procedures. Staff are also required to complete mandatory annual Information Governance and GDPR training. Each study working on the data safe haven has what is known as its own ‘share’, where the study specific data is kept. Access to this share is granted only by the Principal Investigators who request access for each user. Any team member leaving the study has their access revoked.

Re-identification is not permitted under this data sharing agreement.

Any linkage that could identify an individual is not permitted under this agreement.

No linkage, other than that described within the agreement is permitted and no further data linkage will be undertaken.

As part of the LAUNCHES analysis, admission, procedure, diagnosis, discharge, and provider information are required for all matched patients. UCL are also requesting derived fields: age at admission/appointment to four decimal places and age at discharge to four decimal places. UCL are not requesting any dates of birth for this project, to prevent identifiability of the data, so LAUNCHES will use ages at health service interaction to determine each patient’s treatment trajectory. In- and out-patient HES data is critical in identifying the treatment that each patient has received. In addition, HES A&E data are required to identify adverse outcomes such as unplanned emergency treatment following discharge from CHD surgery or through deterioration in a patient’s health. The requested Civil Registration Deaths derived fields of life status, age at life status to four decimal places and place of occurrence of death (home/hospice/hospital/care home/other communal establishment/elsewhere), are vital to complete the patient trajectories.
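
As an illustration of the derived age fields, the sketch below computes an age in years to four decimal places from two dates. It is an assumption-laden example: the agreement does not state how the ages are derived, the 365.25-day year is one common convention chosen here for illustration, and the function name is hypothetical. Such a derivation would be carried out by the organisation holding the date of birth, since UCL does not receive dates of birth.

```python
from datetime import date

# Assumed convention: the source does not say which year length is used.
DAYS_PER_YEAR = 365.25


def age_in_years(date_of_birth: date, event_date: date) -> float:
    """Return age at an event in years, rounded to four decimal places."""
    if event_date < date_of_birth:
        raise ValueError("event_date precedes date_of_birth")
    elapsed_days = (event_date - date_of_birth).days
    return round(elapsed_days / DAYS_PER_YEAR, 4)


# Hypothetical example: 182 days between birth and admission.
example_age = age_in_years(date(2010, 1, 1), date(2010, 7, 2))
```

At four decimal places the resolution is roughly 0.04 days, so the ordering of a patient's events is preserved without a date of birth being released.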

Methods of analysis for the study will include:
1. Data cleaning and descriptive analysis of individual datasets.
2. Develop and update clinical coding maps.
3. Establish and examine variations in longitudinal patient trajectories.
4. Identify candidate metrics to inform routine quality improvement and assurance.

The results of all analyses will be published in aggregate form, with small numbers suppressed in line with HES guidance. No identifiable data will be held by University College London, therefore no identifiable data will be released.
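
Small number suppression of aggregate outputs can be sketched as follows. This is a simplified illustration rather than the rule set in the HES guidance: the threshold of 5 and the `*` marker are assumptions chosen for the example, and the function does not perform secondary suppression (masking further cells so that a suppressed value cannot be recovered from totals).

```python
def suppress_small_counts(counts, threshold=5, marker="*"):
    """Mask counts from 1 up to `threshold` inclusive; keep zeros and larger counts.

    `threshold` and `marker` are illustrative assumptions, not values from the guidance.
    """
    return {
        group: (marker if 0 < value <= threshold else value)
        for group, value in counts.items()
    }


# Hypothetical aggregate table: admissions per centre.
raw = {"Centre A": 132, "Centre B": 3, "Centre C": 0, "Centre D": 5}
published = suppress_small_counts(raw)
```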

NHS Digital reminds all organisations party to this agreement of the need to comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract, i.e. employees, agents and contractors of the Data Recipient who may have access to that data).


Camden & Islington Clinical Record Interactive Search (CRIS) Linkage with HES/Mortality Data — DARS-NIC-408171-X7F8W

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant, Yes (Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 – s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'; Health and Social Care Act 2012 – s261(2)(a); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2021-04-29 — 2024-04-28 2023.05 — 2023.06. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: CAMDEN AND ISLINGTON NHS FOUNDATION TRUST

Sublicensing allowed: No

Datasets:

  1. Civil Registration - Deaths
  2. Demographics
  3. Hospital Episode Statistics Accident and Emergency
  4. Hospital Episode Statistics Admitted Patient Care
  5. Hospital Episode Statistics Critical Care
  6. Hospital Episode Statistics Outpatients
  7. Civil Registrations of Death
  8. Hospital Episode Statistics Accident and Emergency (HES A and E)
  9. Hospital Episode Statistics Admitted Patient Care (HES APC)
  10. Hospital Episode Statistics Critical Care (HES Critical Care)
  11. Hospital Episode Statistics Outpatients (HES OP)

Objectives:

This agreement aims to link HES and Mortality data from NHS Digital with the Camden & Islington NHS Foundation Trust (C&I) Clinical Record Interactive Search (CRIS) Research Database for the purpose of research in the public interest.

Camden & Islington NHS Foundation Trust (C&I) is a large mental healthcare provider serving a geographic catchment area of two inner-city London boroughs, and approximately 470,000 residents. Based on social deprivation scores of 326 local authorities in England, Camden is the 74th and Islington is the 14th most deprived local authority. The variation in the levels of deprivation within both boroughs is large, highlighting the inequalities between different population groups and places. Within Camden there are areas that are within the top 10% most deprived areas in England and areas that are in the 20% least deprived. C&I provides mental health and substance misuse services to people living in Camden and Islington, substance misuse services to Westminster, and a substance misuse and psychological therapies service to residents in Kingston. The Trust has two inpatient facilities, at Highgate Mental Health Centre and St Pancras Hospital, as well as community-based services throughout the London boroughs of Camden and Islington. The Trust provides services for adults of working age, adults with learning difficulties, and older people in community or inpatient settings.

The objective of the data collection is to create a research resource to be used for research projects aiming to investigate physical health outcomes (including mortality) and receipt of health care in people with mental and behavioural health disorders attending secondary mental health care services provided by C&I.

The proposed linkage would significantly increase high quality research outputs that examine the interface between mental and physical health. There is increasing emphasis on the health inequalities experienced by individuals diagnosed with severe mental illness (SMI), commonly defined as schizophrenia, bipolar disorder, schizoaffective disorder and other non-organic psychotic illnesses. These individuals have been found to have a reduced life expectancy of up to 20 years. What is less clear is how other extremely disabling psychiatric disorders, such as severe depression, post-traumatic stress disorder and personality disorders, compare in terms of premature mortality, self-harm and physical health comorbidities. By linking HES and ONS mortality data with C&I CRIS data the study team will explore and quantify this currently under-researched health disparity. The study team will focus specifically on commonly occurring comorbidities and adverse complications such as cardiovascular disease, respiratory disease, cancer, liver disease and self-harm.

This data linkage is supported by service user members of the C&I CRIS Oversight Committee, along with local patient and public involvement (PPI) groups the study team have consulted. People using secondary mental health services are rightfully concerned about their risk of premature mortality and morbidity; this linkage has the potential to answer outstanding questions.

Routine recording of Electronic Health Records (EHRs) at C&I commenced in mid-2008 using RiO, an electronic patient record system. RiO contains a comprehensive, longitudinal record of all clinical information recorded throughout patients’ contacts with Trust services, including socio-demographic information, dates and other details of referrals and admissions, detailed clinical assessments, care plans and standardized assessment forms. The record consists of both structured fields (such as dates and pick-lists) and unstructured free text (including progress notes and correspondence). The CRIS tool, developed by South London and Maudsley NHS Foundation Trust (SLaM) Biomedical Research Cluster (BRC) to extract information from their bespoke electronic Patient Journey System (PJS), consists of a series of data-processing pipelines which both structure and de-identify fields in the electronic patient record, rendering effectively anonymized data from the full clinical record available at the researcher interface. The system allows researchers to search against any combination of structured and unstructured fields that exists in the database. Users then specify the precise fields they want returned (such as specific diagnostic codes, demographic information and/or a particular text string in a clinical assessment).

University College London (UCL) is C&I’s long-standing research partner in clinical research and this is reflected in the development and operation of the C&I CRIS Research Database. The C&I CRIS Research Database administrator is formally employed by University College London, with a substantive honorary research contract with C&I. The C&I CRIS Research Database clinical academic lead holds an academic appointment with UCL in the Division of Psychiatry and is a consultant psychiatrist with C&I.

The C&I CRIS Research Database employs the same security model as that developed by SLaM to address the legal and ethical considerations attendant upon the use of confidential health data. Authorized researchers are provided with regulated access to anonymized information extracted from electronic patient records. The Research Database is used to support epidemiological and population-based research using only anonymized data, for which no patient consent is necessary though patients can opt out entirely if they choose.

The data subjects are individuals who:
(i) have received treatment from the Trust between 2012/13 and 2017/18 and some treated earlier where records existed and could be migrated AND who have not notified C&I that they wish to opt out of having their data collected and/or linked, and/or

(ii) individuals who are or have been resident within the London boroughs of Camden & Islington geographic catchment between 2012/13 and 2017/18 and attended hospital for any reason whilst resident in that catchment area.

C&I, the Data Controller, have contracted with the SLaM Clinical Data Linkage Service (CDLS) to carry out certain necessary duties for the processing, supply and hosting of the distinct, C&I CRIS research database. The South London and Maudsley NHS Foundation Trust will act as the Data Processor for the application.

Specifically, the linkage activity, i.e. sending of Patient Identifiers and receiving of HES and mortality attribute data, will be conducted by the SLaM Clinical Data Linkage Service (CDLS) and the C&I CRIS - HES and mortality linked data will be stored and hosted by SLaM within the C&I specific secure area. The SLaM CDLS provides data processing services (linkage, storage, and data extraction) to external collaborators. All reasonable security steps have been taken to protect data held by SLaM, for example no standalone devices are used and no data is permitted to be stored on the local drives. Furthermore, security measures designed to protect data from being saved outside the SLaM firewall are in place. Data cannot be accessed directly by researchers wishing to use linked data; researchers will only have access to project specific extracts of the C&I-HES/mortality data. Extractions will be carried out within the C&I secure area within the SLaM firewall by C&I CRIS staff. If appropriate, support is provided to external researchers who wish to access the linked data to help them fulfil the necessary requirements to gain approved status.

In summary, SLaM CDLS, as a Data Processor, process C&I clinical data on behalf of and under the contractual obligation to C&I. C&I maintains exclusive control over access to the C&I CRIS research database as Data Controller. SLaM CDLS has no ability or permission to access C&I data. This contractual relationship is analogous to that of an NHS Foundation Trust and any third-party data processor and host insofar as the NHS Foundation Trust remains the Data Controller while the third-party vendor acts as a Data Processor (for example C&I’s relationship with its electronic health record provider). As defined above, SLaM CDLS process the data, governed by the Data Processing Agreement, to fulfil the terms of its contractual obligation to C&I: to create and maintain the C&I CRIS research database. The lawful bases being relied upon to support the flow of confidential patient information from Camden and Islington NHS Foundation Trust (C&I) to SLaM to facilitate the creation of the C&I CRIS Database and this data linkage are: Article 6(1)(e) and Article 9(2)(j) of the General Data Protection Regulation (EU) 2016/679 (GDPR). SLaM are acting as a data processor (as defined in GDPR Article 28) on behalf of C&I who serve as the data controller. Per GDPR Article 28(3), C&I has established a data processing contract and processing agreement for the lawful flow of confidential patient information from C&I to SLaM specifically for the purpose of data processing. SLaM CDLS staff do not have access to the C&I research database.

Access to the C&I CRIS research database is limited to the C&I research database administrator and approved research users at C&I, only for research projects which are approved by the C&I research database oversight committee. Access can only be gained via the C&I network. This includes all data validation and quality checks for pseudonymisation which are conducted by C&I staff following data processing by SLaM CDLS. All external researchers with no contractual arrangements with C&I are required to obtain a Research Passport and Honorary Research Contract prior to project approval. Honorary Research Contracts must be signed by approved research users, their substantive employers, and C&I with wording that the employee will be subject to their substantive employer's disciplinary process if they do anything that they shouldn't with the data. Researchers only have access to pseudonymised linked NHS Digital Data. All research projects are carried out within the C&I network and the linked data remain within the C&I NHS firewall at all times (as is the security model requirement for all analyses of C&I data, regardless of data linkage).

The C&I CRIS Oversight Committee will consider research proposals to use the linked dataset. The Oversight Committee includes research and development governance, Caldicott/ information governance, technical, clinical and service user representation. The Oversight Committee will be responsible for granting or denying approval for all applications to use the linked data. A key consideration of this panel will be to decide if a project poses an increased risk of providing de-anonymised results due to anticipated small cell sizes. Where the panel envisages this to be likely, additional reassurances will be asked for from the applicant and amendments to the application form may be requested. Any successful application will be provided with a bespoke dataset, i.e. one with only the specific variables identified in the application as necessary for the planned analysis. Speculative studies (“data dredging”) will not be permitted.

Research requests vary from year to year. Historically, CRIS has hosted between 5-12 research projects per year from MSc and PhD students at UCL and academic clinical staff at Camden & Islington NHS Foundation Trust. This number has risen in recent years, given the increasing profile of the CRIS research database as a research platform. The study team expect this number to increase following the successful linkage with HES-ONS data which will facilitate more robust, longitudinal analyses. The study team foresee an increase in research projects to 10-18 projects per year. They expect projects pertaining to their established areas of expertise in: severe mental illness, suicide and suicidality, substance use disorders, psychosis, eating disorders, and personality disorders. Understanding the physical health comorbidities of these patients, as well as the causes of mortality, are crucial to research which will elucidate the risk factors and potential interventions to improve care for these individuals.

C&I will only grant access to HES data as part of a linked dataset comprising a minimum of HES and CRIS data (i.e. not for analysis of HES data alone). Broadly, the studies using the linkage have adopted the following designs:
1. Investigations carried out on HES data from the C&I catchment, identifying a HES-derived outcome and comparing its occurrence between people with/without a given diagnosed mental and behavioural health disorder in order to derive standardised morbidity ratios (for example, some current research investigating respiratory disease admissions in people with learning disability compared to the local population);
2. Investigations restricted to people with a given HES-derived outcome and comparing subsequent events between people with/without a given diagnosed mental and behavioural health disorder (for example, further analyses of people with/without a learning disability who have a respiratory disease admission, comparing duration of hospitalisation and risk of readmission between the two groups);
3. Investigations restricted to people with a given diagnosed mental and behavioural health disorder investigating one or more HES-derived outcomes in relation to C&I-derived information (for example, investigating the relationship between mental health symptom profiles and physical health events in people with severe mental illness);
4. Investigations primarily carried out using C&I data, where HES-derived information is used to provide supplementary information (for example, the ability to adjust for serious physical illness in a number of analyses). This includes the use of mental healthcare data contained on HES for residents in the C&I catchment to capture mental health service use by providers other than C&I (e.g. out-of-catchment hospitalisations);
5. Investigations primarily carried out using C&I data where a HES outcome is used to define the sample (for example, a series of analyses investigating medication and health outcomes before and after childbirth in women with pre-existing severe mental illness).
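
Design 1 above refers to deriving standardised morbidity ratios. The following is a minimal sketch of one common way to compute such a ratio (indirect standardisation), not the study team's own method. The function name, the age bands and all figures are hypothetical and chosen only for illustration.

```python
from typing import Dict


def standardised_morbidity_ratio(
    observed_events: int,
    person_years: Dict[str, float],
    reference_rates: Dict[str, float],
) -> float:
    """Observed events divided by the events expected at reference rates.

    person_years: follow-up time in the study group, per stratum (e.g. age band).
    reference_rates: events per person-year in the comparison population,
    for the same strata.
    """
    # Expected events: apply each reference rate to the study group's
    # follow-up time in that stratum, then sum across strata.
    expected = sum(
        person_years[stratum] * reference_rates[stratum]
        for stratum in person_years
    )
    if expected <= 0:
        raise ValueError("expected number of events must be positive")
    return observed_events / expected


# Hypothetical inputs, for illustration only.
example_smr = standardised_morbidity_ratio(
    observed_events=45,
    person_years={"18-44": 1000.0, "45+": 500.0},
    reference_rates={"18-44": 0.01, "45+": 0.04},
)
```

A ratio above 1 would indicate more events in the study group than expected from the reference rates; a ratio below 1 would indicate fewer.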

One further planned linkage with Public Health England and the National Cancer Registry is currently under review by Public Health England’s Office for Data Release. This dataset will be stored separately from the proposed HES-ONS-CRIS data linkage and there is no intention or technical ability to link data from the National Cancer Registry to HES-ONS-CRIS linked data.

Expected Benefits:

The over-arching objective of this research programme is to provide information that will assist in narrowing the mortality and physical morbidity disadvantage experienced by people with diagnosed mental and behavioural health disorders. Improvement in the physical health of people with diagnosed mental and behavioural health disorders is highlighted regularly in Government policy and the monitoring of physical health outcomes is increasingly becoming a metric for mental health Trusts, as well as for national structures such as the PHE Mental Health Intelligence Network. The proposed linkage would significantly increase high quality research outputs that examine the interface between mental and physical health. This innovation is supported by service user members of the C&I CRIS oversight committee, along with local patient and public involvement groups the study team have consulted. People using secondary mental health services are rightfully concerned about their risk of premature mortality and morbidity; this linkage has the potential to answer the outstanding questions the study team has outlined below.

There is increasing emphasis on the health inequalities experienced by individuals diagnosed with severe mental illness (SMI), commonly defined as schizophrenia, bipolar disorder, schizoaffective disorder and other non-organic psychotic illnesses. These individuals have been found to have a reduced life expectancy of up to 20 years. What is less clear is how other extremely disabling psychiatric disorders, such as severe depression, post-traumatic stress disorder and personality disorders, compare in terms of premature mortality, self-harm and physical health comorbidities. By linking HES and ONS mortality data with C&I CRIS data the study team will explore and quantify this currently under-researched health disparity. The study team will focus specifically on commonly occurring comorbidities and adverse complications such as cardiovascular disease, respiratory disease, cancer, liver disease and self-harm. Self-harm is an interesting example, as instances of this are currently not well covered in the C&I CRIS records, whereas severe self-harm attempts resulting in emergency department attendance will be well recorded in HES data.

In acknowledgement of this health inequality for mental health patients, there have been concerted national efforts over the past two decades to improve health parity. For example, the Department of Health’s ‘no health without mental health’ policy document, and more recently the NHS’s ‘Five year forward view for mental health’ strategic guidance, identify reduction of morbidity and mortality for people with diagnosed mental and behavioural health disorders as key targets. Moreover, the issue of increased morbidity and mortality is a key strand in the study team’s clinical work for people with mental health problems locally at C&I. Recent research indicates that people with severe mental illness remain vulnerable. Therefore, understanding the relationship between physical and mental health, and pathways and barriers to receipt of appropriate physical healthcare, is of enduring relevance. The reasons underlying these disparities in morbidity and mortality are complex and thought to be due to a combination of individual and social factors. These may include the long-term use of antipsychotics and adverse social or economic determinants of health (smoking, obesity, inactivity, and illicit drug use), as well as the cumulative effects of deprivation, stigma and social exclusion, which may all contribute to higher rates of cardiovascular disease, respiratory disease, diabetes mellitus and its complications.

Given the study team’s existing data assets which detail pathways of secondary mental health clinical care through the Trust, including clinical free-text along with structured fields, the study team believe that the C&I Research Database will offer greater insights into clinical care than obtaining NHS Digital Mental Health data sets.

Thus far, guidelines on physical healthcare in people with diagnosed mental and behavioural health disorders are mostly extrapolated from studies of the general population without considering more specific risks in those with mental health problems. Given the lack of improvement in health inequalities associated with mental illness, and the persistence of differential morbidity/mortality, there is a pressing need for further research and more thorough investigation into reasons for general hospital admissions among people with diagnosed mental and behavioural health disorders. Insights from this research can meaningfully inform clinical practices, including care guidelines, which can improve routine care and advance understanding of how physical health comorbidities present differently and uniquely among those with diagnosed mental and behavioural health disorders. Reducing disparities in morbidity/mortality is a key goal for overall population health.

There is an existing HES-ONS-CRIS linkage using data from South London and Maudsley (SLaM) NHS Trust (a completely separate data flow from C&I’s proposed linkage). The adverse health outcomes of people with diagnosed mental and behavioural health disorders have been demonstrated with this powerful data-linkage, which the study team aims to replicate and extend. Examples include a description of the most common reasons for acute hospital admissions in people with severe mental illness and the predictors of admissions with falls and fractures in this group. SLaM have also provided an evaluation of the accuracy of HES discharge diagnoses for ascertaining diagnosed mental and behavioural health disorders, of importance for groups using HES for this purpose. Moreover, a number of publications have used linked HES data to investigate physical health outcomes experienced by people with a recent dementia diagnosis, including investigations of hospitalisations in people suffering dementia with Lewy bodies, the impact of polypharmacy on hospitalisation outcomes, predictors of falls and fractures, emergency department use close to the end of life, and an evaluation of the accuracy of dementia diagnoses recorded on HES. These publications show that linkage of CRIS data with HES and ONS mortality data has clear potential to yield novel and high quality research publications. However, there has been little research output regarding:
(1) co-morbidity and premature mortality in non-SMI patient groups, and
(2) pathways to treatment for physical health problems in various patient groups, examining if treatment options are offered equitably and how these affect outcomes and mortality.

Furthermore, all studies presented above are based on data from a single Trust (SLaM). This means important work is needed to replicate findings across multiple Trusts with different patient populations and NHS providers, along with further closing of the gaps in knowledge outlined above.

Physical health disadvantages are likely to cross multiple disorders and multiple levels of morbidity: from mortality to non-fatal conditions, and from the individual impact of serious health conditions to the wider economic impacts of increased secondary care use, longer hospitalisations, and increased risk of readmission. There is therefore a need for a coordinated series of analyses to inform on specific areas of inequality in order to target interventions to improve health. In order to improve morbidity and mortality through health and social care interventions, it is important both to have information on the adverse outcomes potentially underlying disadvantages and to be able to characterise groups most at risk of these outcomes.

Outputs:

The primary output of the linkage is the production and maintenance of a research resource for the purpose of use in informative research analyses for publication in peer-reviewed journals and other standard routes of academic dissemination (e.g. conference presentations).

All secondary outputs (whether tables or visuals) will only include aggregated data suppressed according to the HES analysis guide. Outputs must also comply with the UK Data Service’s Handbook on Statistical Disclosure Control for Outputs including the rules around secondary suppression where applicable.

The study team expect that a minimum of two research papers would be published per year using the proposed data linkages. Examples of papers planned for publication include:

1. Co-morbidity and premature mortality in non-severe mental illness patients. Severe mental illness (SMI) is commonly defined as schizophrenia, bipolar disorder and other psychotic disorders. Published research has explored prevalence and incidence of physical co-morbidities among patients with SMI, but less is known about these comorbidities for other mental diagnostic groups such as depression, PTSD, and personality disorder. The paper will explore if common physical co-morbidities such as cardiovascular disease, diabetes, severe asthma, chronic obstructive pulmonary disease, and cancer are also overrepresented in non-SMI patient populations using secondary care mental health services. By linking C&I CRIS with HES-ONS mortality the study team will compare incidence and prevalence between SMI and non-SMI patients, using both published population control estimates as well as matched HES-ONS mortality control data drawn from Camden and Islington boroughs, looking at both co-morbidity and mortality from the physical conditions known to be overrepresented in SMI. The study team will also explore predictors of mortality and co-morbidity including demographic, social and clinical factors.

2. Mental illness - Pathways to physical healthcare. In order to reduce the physical health inequality experienced by patients suffering from mental health issues, availability and quality of treatment offered prior to and after receiving a comorbid physical diagnosis is of high importance. For example, are patients with mental health problems less likely to receive coronary angioplasty and stents? Are they less likely to receive transplants? Through the linkage of CRIS with HES-ONS mortality, the study team will be able to map out which treatments were offered to patients, and explore how such treatments (or lack of) impacted outcomes of physical and mental health and/or if treatments can be related to cause-specific mortality. The study team will also explore if there is inequality between mental health diagnostic groups in terms of treatment offers and pathways to physical healthcare, and the degree to which social deprivation, ethnicity, diagnosis, medication, age and sex explain any disparities.

All the potential uses of the linked data fall within the stated primary purpose of investigating physical health in people with diagnosed mental and behavioural health disorders. The data being requested will only be used for the purpose described. Any proposed changes will be submitted to NHS Digital for amendment and approval before implementation.

Publication targets will clearly depend on the nature of individual findings and the potential audience envisaged. Where possible, the study team will target general medical and/or public health journals with a broad audience, because analyses are likely to cross disciplines; however, they will also consider specialist journals within the mental health field as well as the individual medical specialties implicated. Dissemination at national and international conferences will adopt a similar strategy of aiming for as broad as possible a reach. They will include mental health focused meetings such as the Royal College of Psychiatrists and European Psychiatric Association congresses, and psychiatric epidemiology meetings such as the International Federation of Psychiatric Epidemiology (IFPE) but they will also seek presentations at medical specialty conferences where results have relevance to those audiences, as well as meetings where commissioners are likely to be represented.

For each application received, the CRIS Oversight Committee considers the study design and advises on optimisation of benefits. The CRIS Oversight Committee also has a responsibility for publicity and dissemination of findings to relevant parties, media and patient groups.

Patient and public involvement (PPI) is central to the operation and ethical approval for the C&I CRIS research database. There are three service users on the C&I CRIS research database oversight committee. All applications for projects to access the C&I research database are reviewed by a service user.

Separately, the study team also have a Data Science PPI group (chaired by the McPin Foundation – a charity integrating experts by experience into research www.mcpin.org) who comment on and contribute to the design, conduct, and dissemination of studies using the C&I CRIS research database. While this group does not review applications for use of the C&I research database, their advisory role provides important guidance and insights into academic research, including ensuring that research questions are appropriately framed and that research findings are meaningfully contextualised. This PPI group continues to meet regularly to offer their guidance to C&I research database users. The McPin Foundation are not considered Data Controllers as they have no say over the data processing methodology. They provide facilitation support to the separate Data Science PPI group given their expertise in integrating lived experience into academic research. The Data Science PPI group provides important insights and guidance but does not regulate access to CRIS data – that is the remit of the CRIS Governance Board which also includes service user/carer representation.

Processing:

The Clinical Record Interactive Search (CRIS) system contains pseudonymised copies of C&I’s electronic patient records for all patients (i.e. all C&I service users) other than those who exercised their right to opt out of participation.

The study research will start with a broad descriptive analysis and the study team will then focus on two specific projects. These projects demonstrate how the linked fully pseudonymised dataset will be used to investigate physical health service provision to adults who have been referred to C&I services compared with the local population. Outputs from the studies have the potential to rapidly inform local and national health and mental health service developments, especially given the size and well characterised nature of the sample. The longitudinal nature of the data will provide future investigations the opportunity to study the impact of treatment and diagnosis on individual health-related outcomes over time. Result summaries will be fed back to relevant organisations such as NICE, and promoted locally with the aim of directly impacting NHS policies and current patient care.

The justification for using non-consent approaches is that linking administrative data is preferred over primary data collection because it provides accurate and complete information and makes efficient use of existing resources. It also has ethical advantages over collecting new survey data, particularly from disadvantaged and vulnerable individuals whose responses are of the greatest importance yet particularly challenging to obtain. These ethical and methodological advantages are of course subject to the security and confidentiality of data linkage, storage and access, and depend on rigorous information governance and stakeholder consultation procedures.

The study is designed as a series of retrospective clinical cohort studies of adults who have received secondary mental health care, utilising an individually matched dataset containing longitudinal pseudonymised health data on physical health and mortality. The data requested is listed below:

Hospital Episode Statistics (HES) Critical Care 2012/13 – 2017/18
HES Outpatients 2012/13 – 2017/18
HES Admitted Patient Care 2012/13 – 2017/18
HES Accident & Emergency 2012/13 – 2017/18
Civil Registrations (Deaths) data extract to cover this period.
Demographics data extract to cover this period.

The cohort will be made up of: All adults (aged 18 and over) who have been referred for C&I treatment between 1st January 2008 and 31st December 2018. The sample size is approximately 146,000 adults, and characterised with a range of symptom severity from common diagnosed mental and behavioural health disorders (e.g. depression and anxiety) to severe diagnosed mental and behavioural health disorders (e.g. schizophrenia, bipolar affective disorder), substance use disorders and organic disorders (e.g. neurological syndromes associated with severe intellectual impairment).

Measures: As described in a number of recent studies, C&I CRIS data provides individual level data on sociodemographic (date of birth, sex, ethnicity, neighbourhood deprivation) and time variant data on ICD-10 psychiatric diagnoses, diagnostic assessments, illness severity (e.g. via scales including the Health of the Nation Outcome Scales), risks (e.g. suicidal ideation, physical disability, to others and from others), mental health treatment – frequency of contact, type, professionals involved, local or specialist services, community vs inpatient, medication (e.g. antipsychotics, stimulants, anti-depressants, hypnotics) and psychotherapeutic interventions (individual or group CBT, family therapy, psychodynamic etc.) and treatment adherence. The study team use General Architecture for Text Engineering (GATE) software to develop precise text mining algorithms to extract and code clinically relevant free text data (typed notes) from CRIS.
Hospital Episode Statistics (HES) are held by NHS Digital and include all accident and emergency, hospital admissions and outpatient visits which occur in all hospitals throughout England. This includes important clinical information such as diagnoses, operations or the speciality of the treating clinician; demographic information such as age, sex and ethnicity; and also administrative data such as methods of admission and discharge.

The Office for National Statistics (ONS) collects information on cause of death from an individual’s death certificate; this information is held by NHS Digital in the form of Civil Registration (Deaths) data extracts. This includes diagnosis (using both free text and structured ICD10), and date and cause of death.

Using deterministic matching techniques NHS Digital will link the C&I CRIS and HES/mortality data sets for all patients seen by C&I services. This includes those resident to the boroughs of Camden and Islington. However, it also includes those referred to C&I national and specialist services from outside the catchment area. In addition, pseudonymised HES/mortality data on the residents of the two London boroughs which form the C&I catchment area (Camden and Islington) will also be sought to enable comparison. The HES/mortality data will enhance CRIS data, enabling researchers to explore and identify health inequalities.

The linkage will not generate or collect new data. Rather, it will be a static linking of datasets. Both datasets have been previously created as a matter of course in the performance of service activities by each Data Controller. The utility of the linked dataset created will be demonstrated through the programme of research described below.

Methodology:
1. SLaM CDLS creates a cohort (approximately 146,000 individuals) with identifiers to include Study ID (BRCID), NHS Number, Post Code, First Name, Last Name, sex, and Date of Birth and sends this to NHS Digital via Secure Electronic File Transfer (SEFT).

2. NHS Digital extracts the HES and mortality data fields requested and removes the identifiers, leaving the Study ID in place. NHS Digital sends the pseudonymised extract to SLaM CDLS via SEFT.

3. SLaM CDLS uploads the pseudonymised data to the C&I CDLS data safe haven.
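
Steps 1 and 2 above can be sketched as follows. This is an illustration under stated assumptions, not NHS Digital's actual procedure: the agreement does not state which fields are used for deterministic matching, so an exact match on NHS Number and Date of Birth is assumed here, and the field names, the record layout and the choice of which fields count as identifiers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

# Assumed set of identifier fields to strip from the returned extract.
IDENTIFIER_FIELDS = {
    "nhs_number", "postcode", "first_name", "last_name", "date_of_birth",
}


@dataclass(frozen=True)
class CohortMember:
    """One row of the cohort file sent in step 1 (subset of fields)."""
    study_id: str       # the Study ID (BRCID) that stays on the extract
    nhs_number: str
    date_of_birth: str  # ISO format, e.g. "1980-01-31"


def link_and_pseudonymise(
    cohort: Iterable[CohortMember],
    health_records: Iterable[Dict[str, str]],
) -> List[Dict[str, str]]:
    """Match records to the cohort exactly, then drop the identifiers."""
    # Deterministic matching: a record links only if the key is identical.
    index = {(m.nhs_number, m.date_of_birth): m.study_id for m in cohort}
    extract = []
    for record in health_records:
        key = (record.get("nhs_number"), record.get("date_of_birth"))
        study_id = index.get(key)
        if study_id is None:
            continue  # not a cohort member, so not part of the extract
        # Step 2: remove identifiers, leaving only the Study ID in place.
        clinical = {
            name: value
            for name, value in record.items()
            if name not in IDENTIFIER_FIELDS
        }
        extract.append({"study_id": study_id, **clinical})
    return extract
```

The returned rows carry the Study ID and the clinical fields only, which corresponds to the pseudonymised extract described in step 2.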

The HES and mortality data will not be linked with patient identifiers from C&I’s electronic patient record and no attempt will be made to re-identify individuals in the data under any circumstances.

C&I will manage and finance the resources required to sustain the proposed database. More specifically, the day to day processes of running the database will be conducted by a collaborative team within the SLaM Clinical Data Linkage Service (CDLS) who are hosting C&I’s data within a C&I specific area within a secure firewall in the SLaM network. Therefore, the day to day processes of hosting the database will be managed by this team. The SLaM CDLS is an impartial, trusted third party service and comprised of a small, dedicated team of informatics, IT, and Information Governance (IG) professionals. The SLaM CDLS is part of both SLaM ICT and Information Governance Departments. There will be no further linkage of the NHS Digital data.

All the datasets will be stored separately and are only accessible to a restricted number of approved technical support staff. Technical staff (all of whom are substantive employees of C&I or C&I’s Data Processor, SLaM CDLS) will then assemble bespoke de-identified linked databases meeting the approved requirements of the research study. These are deposited in shared network drives within the C&I network. For each research database created a different encoded identifier variable (anonym) will be assigned meaning there are no common identifiers or pseudo-IDs across different databases making it impossible for researchers to link their database with source C&I, HES, or Mortality data. This uses a one-way encryption method following which anonyms cannot be reverse engineered.
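
The agreement does not name the one-way method used to generate anonyms. As a minimal sketch only, the property described above (a different, non-reversible identifier for the same person in each research database) can be obtained with a keyed hash such as HMAC-SHA256 and a separate secret key per research database. The function names below are hypothetical and the choice of HMAC-SHA256 is an assumption.

```python
import hashlib
import hmac
import secrets


def new_database_key() -> bytes:
    """Generate a fresh random secret key for one research database."""
    return secrets.token_bytes(32)


def make_anonym(study_id: str, database_key: bytes) -> str:
    """Derive a database-specific anonym from a Study ID.

    Without the key, the anonym cannot be recomputed from a Study ID,
    and anonyms made under different keys do not match each other.
    """
    digest = hmac.new(database_key, study_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# The same (hypothetical) Study ID in two separately keyed databases.
key_project_a = new_database_key()
key_project_b = new_database_key()
anonym_a = make_anonym("BRC001", key_project_a)
anonym_b = make_anonym("BRC001", key_project_b)
```

Because each database has its own key, `anonym_a` and `anonym_b` differ even though they derive from the same Study ID, so the two research databases share no common identifier. In such a scheme the keys would need to be held only by the technical staff who assemble the databases.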

Microsoft Ltd provide Azure Backup Storage Services for South London and Maudsley NHS Foundation Trust and are therefore listed as a data processor. They supply support to the system, but do not access data. Therefore, any access to the data held under this agreement would be considered a breach of the agreement. This includes granting of access to the database[s] containing the data.

When an application has been approved by the CRIS Oversight Committee, technical staff, all of whom are substantive employees of C&I, assemble bespoke de-identified linked databases meeting the approved requirements of the research study. These are deposited in shared network drives within the C&I network.

Approved researchers can only access the data on location within the C&I network. All research databases remain within the C&I firewall at all times on the C&I network. A dedicated office suite has been set up in the Bloomsbury Building onsite at St. Pancras Hospital in order to facilitate analyses using C&I data. Removal of data from this environment is expressly forbidden other than in the form of aggregated summary data with small numbers suppressed in line with the HES Analysis Guide. For each research database created a different encoded identifier variable (anonym) is assigned meaning there are no common identifiers or pseudo-IDs across different databases making it impossible for researchers to link their database with source CRIS, HES or mortality data. This uses a one-way encryption method following which anonyms cannot be reverse engineered. Researchers do not have access to the record level identifiable or pseudonymised HES or mortality data.

At the completion of research projects, the databases used are removed from the shared network drive and archived for a period of 5 years and then permanently destroyed.

HES and ECDS DISCLOSURE CONTROL / SMALL NUMBER SUPPRESSION
In order to protect patient confidentiality, when presenting results calculated from HES record level data, outputs will contain only aggregate level data with small numbers suppressed in line with HES Analysis Guide. When publishing HES data, you must make sure that:
• Cell values from 1 to 7 are suppressed at a local level to prevent possible identification of individuals from small counts within the table.
• Zeros (0) do not need to be suppressed.
• All other counts will be rounded to the nearest 5.
Data will not be made available to any third parties other than those specified except in the form of aggregated outputs with small numbers suppressed in line with the HES Analysis Guide.
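
The three suppression rules listed above can be expressed as a short function. This is a sketch of the rules as stated in this section, not an official implementation of the HES Analysis Guide; the function names and the `*` marker for suppressed cells are assumptions.

```python
from typing import Dict, Optional


def apply_disclosure_control(count: int) -> Optional[int]:
    """Return the publishable value for one cell, or None if suppressed."""
    if count < 0:
        raise ValueError("a count cannot be negative")
    if count == 0:
        return 0      # zeros do not need to be suppressed
    if count <= 7:
        return None   # cell values from 1 to 7 are suppressed
    # All other counts are rounded to the nearest 5. Integer counts never
    # sit exactly halfway between two multiples of 5, so no tie rule is
    # needed.
    return 5 * ((count + 2) // 5)


def protect_table(cells: Dict[str, int]) -> Dict[str, str]:
    """Apply the rules to every cell, marking suppressed cells with '*'."""
    protected = {}
    for label, count in cells.items():
        value = apply_disclosure_control(count)
        protected[label] = "*" if value is None else str(value)
    return protected
```

For example, `protect_table({"a": 0, "b": 3, "c": 12, "d": 13})` leaves the zero unchanged, suppresses the count of 3, and rounds 12 to 10 and 13 to 15. Secondary suppression, mentioned elsewhere in this agreement, is not covered by this sketch.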


Understanding the health needs of mothers involved in family court cases — DARS-NIC-196263-J9Q7Z

Type of data: information not disclosed for TRE projects

Opt outs honoured: No - Statutory exemption to flow confidential data without consent, Anonymised - ICO Code Compliant, No (Statutory exemption to flow confidential data without consent)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(2)(b)(ii), Health and Social Care Act 2012 – s261(2)(a)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2020-12-01 — 2023-11-30 2021.01 — 2023.06. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. MRIS - Bespoke
  2. Civil Registration (Deaths) - Secondary Care Cut
  3. HES:Civil Registration (Deaths) bridge
  4. Hospital Episode Statistics Accident and Emergency
  5. Hospital Episode Statistics Admitted Patient Care
  6. Hospital Episode Statistics Outpatients
  7. MRIS - List Cleaning Report
  8. Civil Registrations of Death - Secondary Care Cut
  9. Hospital Episode Statistics Accident and Emergency (HES A and E)
  10. Hospital Episode Statistics Admitted Patient Care (HES APC)
  11. Hospital Episode Statistics Outpatients (HES OP)

Objectives:

University College London (UCL) are proposing to link an existing cohort of mothers and babies (held by the research team under a separate Data Sharing Agreement (DSA) with UCL; NIC-393510-D6H1D) with programme information from the Children and Family Court Advisory and Support Service (Cafcass). Women aged between 15 and 50 years, with at least one live birth recorded in the Hospital Episodes Database between 01/04/1997 and 31/03/2017 will make up the cohort. It is estimated that there will be a maximum of 12 million women in this cohort. This cohort of women will be linked to Cafcass data on women involved in care proceedings between 01/04/2007 and 31/03/2019, which includes 113,191 mothers. Those women who are not linked to Cafcass data will act as a control cohort for analysis. This will allow UCL to compare women linked in the Cafcass data to the general population of women giving birth, as 97% of women in England give birth in NHS hospitals.

The Cafcass extract used for this study will be minimised by the UCL research team to only contain information on women aged 15-50 years who were party to section 31 applications (care proceedings where local authorities apply to have a child removed from parental supervision due to serious concerns for the child’s safety and wellbeing). The extract will contain demographic information on the mothers (date of birth, local authority of residence), information on the children subject to the case (number of children, month and year of birth, sex), and information on the care proceedings (hearing dates, final legal output).

The Cafcass data is held by the Children and Family Court Advisory and Support Service, and available for researchers upon request. (Research must be approved by the Cafcass Research Advisory Committee. The committee checks if the proposed research is feasible, as well as beneficial for Cafcass and for children and families using the service. The research will only be approved if it is relevant to Cafcass’ statutory remit and strategic aims. In this case, the committee has approved the UCL project as being done in accordance with CJCSA section 13 and falling within its strategic aims).

The data under NIC-393510-D6H1D is disseminated for a programme of research within the healthcare provision theme of the Policy Research Unit for Children, Young People and Families (CPRU), within University College London (UCL) which is funded by the Department of Health and Social Care (DHSC). The extract of data from NIC-393510-D6H1D for the purpose of this agreement NIC-196253 will only contain HES APC, A&E, Outpatient data and Civil Registration data on women who had a record of a live birth between the ages of 15 and 50 years old between 1 April 1997 and 31 March 2017.

This request aims to assess the feasibility of using linked health and family court administrative data sets to facilitate research exploring underlying issues such as long-term health and mental health problems. Further, it aims to inform policy about associations between justice system risk factors, such as regional variation in returns to court, and health outcomes, such as time to next pregnancy and mental health exacerbations during court proceedings.

The study aims to generate evidence about the health needs of mothers involved in public law care proceedings in England. There is clear evidence that mothers, whose children enter public care or are adopted, often have complex health needs, such as drug and/or alcohol misuse, exposure to violence, mental health problems as well as chronic physical conditions.

Key questions are:

Question 1: Are there health characteristics of mothers that are associated with a high likelihood of care proceedings?

Question 2: Among mothers involved with care proceedings, what characteristics are associated with time to subsequent pregnancy, adversity related admissions and adverse outcomes related to court?

UCL has defined adversity related admissions as admissions related to alcohol and/or illicit drug use, self-harm, violence and severe mental health problems. As adverse outcomes, UCL will assess time to next pregnancy (and return to court for further children in the Cafcass data), emergency hospital admissions and deaths.

Key policy questions will be addressed by this study:
1) Is there an unmet burden of health need that could be reduced by interventions in courts, social services or health services?
2) Could improved input by healthcare services reduce the chance of being the subject of care proceedings and improve health and welfare outcomes for mother and child?

UCL will attempt to generate research evidence to answer these questions by assessing which health characteristics of mothers are associated with a high likelihood of being involved in care proceedings, and determining what characteristics are associated with time to subsequent pregnancy, adversity related admissions and adverse outcomes related to court.

UCL will explore risk factors known at live birth (such as maternal age, parity, (history of) mental health related admissions, underlying long-term conditions, adversity-related injury admissions) and their association with timing of care proceedings. UCL require whole HES records to assess how and when women use health services, determine whether there is proactive management through outpatients, assess whether health crises recorded as Accident and Emergency (A&E) attendances or unplanned admissions are related to hearing dates, and assess underlying conditions using women’s medical history as recorded in Hospital Episode Statistics (HES) (e.g. long-term conditions or adversity-related admissions as recorded in ICD-10 or OPCS codes (classification of Interventions and Procedures)). UCL request HES data from 1997 in order to identify the birth episodes for children involved in court cases from 2007 onwards, as well as to identify any long-term conditions in the mothers.

The purpose of this application falls under Article 6 (1) (e) of the GDPR and the lawful basis for using information collected routinely for administrative purposes for research is the ‘public task’. This is part of the University’s commitment to ‘integrate research and innovation for the long-term benefit of humanity’. The application also falls under Article 9 (2) (j), as scientific research.

The study has been designed by UCL researchers without the involvement of the funders (Nuffield Foundation).

There are no alternative, less intrusive ways of achieving the purpose of this study.

UCL are the sole data controller who also process the data under this agreement.

There are no organisations involved in the study other than those referenced.

Expected Benefits:

The research project in this application informs policy and practice. All analyses undertaken as part of this project aim to provide evidence to inform health care professionals, service providers, policy makers and service users about the health of mothers in the family court system and how services currently meet their needs.

Results will provide evidence of how healthcare need and use of services differs between women who become subject to care proceedings and those who do not. This will provide evidence that can guide the design of potential interventions to reduce care proceedings that Cafcass could test and implement.

This study will examine whether indicators for maternal adversity recorded in healthcare such as a history of admissions for mental health problems, drug/alcohol abuse or exposure to violence can identify groups of women at a live birth event who could benefit from proactive or preventive healthcare input that might reduce risk of involvement in (recurrent) care proceedings and improve health outcomes for both mothers and children.

This information will help family courts as it can identify health needs that health services are aware of at first contact with courts. Improved communication between services could direct potential interventions at this point. Additionally, results will evaluate the healthcare trajectories of women involved in (recurrent) care proceedings. This will provide evidence to assess whether healthcare services can identify (and potentially address) health needs in women at live birth before they go to court.

Evidence is lacking about the extent to which health needs are addressed by services before, during and after court involvement.

The sparse evidence currently available on families involved in recurrent care proceedings suggests this group is particularly vulnerable. Understanding when and how mothers use healthcare, what their health trajectories before care proceedings look like compared to other women with a live birth, and what their healthcare needs are before, during and after involvement in care proceedings will inform interventions aimed at safeguarding vulnerable families (defined as families involved in section 31 care proceedings for this study).

The hypothesis is that a longer time to next pregnancy following a first care proceeding is associated with a reduced risk of recurrent care proceedings, but this association may vary by area, final legal order of index care proceeding and individual risk factors such as presence of long-term conditions or mental health needs in the mother.

This study will also test the feasibility and success of linkage between health and family court data, while evaluating an important policy question about the association between health service use and recurrent care proceedings, taking into account risk factors.

Outputs:

The research will inform Department of Health and Social Care and Ministry of Justice policy makers, service providers and practitioners about patient and service factors associated with the health needs of mothers and children involved in care proceedings in family court. UCL will work with the funder, the Nuffield Foundation, to organise a symposium inviting key policy makers and scientists to showcase the findings.

Additionally, UCL will share results with patients who are part of a mental health service user group based at King’s College London and PAUSE, a charity working with women who have experienced, or are at risk of, repeat removals of children from their care. https://www.pause.org.uk/about-us/

UCL has also formed a project advisory group that includes six leading practitioners from across health, family justice and child safeguarding. Part of UCL’s planned discussions with this group will include identifying ways to improve communication about this project and to identify a wide range of organisations including within health, children’s social care, family justice and the voluntary sector with which to share study findings. This group includes a PAUSE practice lead and the programme manager for SHRINE (a human rights-based service in South London delivering sexual and reproductive healthcare to marginalised populations including people with serious mental illness or substance misuse).

Previous public and patient engagement work for this data linkage project has helped to identify further stakeholders to engage with this work. UCL will continue to undertake public and patient engagement work to support interpretation of findings and to discuss methods of sharing study findings to ensure findings are accessible to policy makers, practitioners and the study population.

The findings will be published in peer reviewed journals and reports prepared for the funder, the Nuffield Foundation.

The papers resulting from these studies will be published in peer-reviewed journals (such as the Lancet, Archives of Disease in Childhood, PLoS Medicine, BMJ Open) and presented at scientific conferences (such as the International Population Data Linkage Conference, International Society for the Prevention of Child Abuse and Neglect, Royal College of Paediatrics and Child Health annual conference, and Informatics for Health conference).

UCL aim to present the work at scientific conferences during 2021 and use feedback provided at these meetings to write up papers to be submitted for publication in 2021.

Outputs will be aggregated with small numbers suppressed in line with the HES analysis guide.
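As an illustration of what small-number suppression on aggregated outputs can look like, the following is a minimal sketch only. The threshold value, the `*` marker, the treatment of zero counts and the function name are assumptions made for this example; the applicable rules are those of the HES analysis guide in force for the agreement, and secondary suppression (preventing a suppressed cell being recovered from totals) is not handled here.

```python
from typing import Dict, Union

# Assumed threshold for illustration only; the real value must be taken
# from the HES analysis guide that applies to the agreement.
SUPPRESSION_THRESHOLD = 7


def suppress_small_numbers(
    counts: Dict[str, int], threshold: int = SUPPRESSION_THRESHOLD
) -> Dict[str, Union[int, str]]:
    """Replace counts from 1 up to `threshold` with '*'.

    Zero counts and counts above the threshold pass through unchanged
    (an assumption of this sketch).
    """
    return {
        group: "*" if 0 < n <= threshold else n
        for group, n in counts.items()
    }


# Fictitious aggregate table, not real data.
admissions_by_region = {"North East": 3, "London": 412, "South West": 0}
suppressed = suppress_small_numbers(admissions_by_region)
```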

Processing:

Data sources

The datasets to be linked are:
1) Hospital Episode Statistics (HES APC, A&E and OPD) and death registration data which contains details of all hospital contacts and deaths in NHS hospitals in England collected from 1 April 1997 to date (data UCL already hold under NIC-393510-D6H1D). Women aged between 15 and 50 years, with at least one live birth recorded in the Hospital Episodes Database between 01/04/1997 and 31/03/2017 will be extracted to make up the cohort to be linked with the Cafcass data.
2) The Children and Family Court Advisory and Support Service database (Cafcass) contains information on family care proceedings, including demographic information on persons involved in court applications (adults and children), hearing and application dates, and proceeding outcomes. Data is available from Cafcass from 2007 onwards. The Cafcass extract used for this study will be minimised to only contain information on women aged 15-50 years who were party to section 31 applications (care proceedings where local authorities apply to have a child removed from parental supervision due to serious concerns for the child safety and wellbeing). Cafcass will minimise the data in this way before it is sent to UCL. The extract will contain demographic information on the mothers (date of birth, local authority of residence), information on the children subject to the case (number of children, month and year of birth, sex), and information on the care proceedings (hearing dates, final legal output).

The study cohort will include all women for whom a section 31 application has been made (by an English local authority) concerning their child(ren) between April 2007 and March 2019. Though most section 31 applications result in a section 31 order being made, a considerable number result in children being placed with extended family under private law orders (such as a Special Guardianship Order) and a small number of applications are dismissed or are subject to an ‘Order of No Order’. Each of these outcomes is equally important, particularly when considering the heterogeneity of this study population with respect to identifying risk factors for returning to court for subsequent s31 court proceedings.

To mitigate the risk of re-identification, identifiers will be used by the trusted third party (NHS Digital) to carry out the link between the two data sets (Cafcass and HES, via PDS); the final analysis file will only contain pseudonymised non-sensitive variables.

Data flow:
• Cafcass will supply NHS Digital with a list of identifier variables for women aged 15-50 years involved in section 31 care proceedings, including name (first and surname), date of birth, local authority of residence during care proceeding and postcode histories, alongside a study specific pseudo-identifier number (pseudo-study ID) for mothers involved in care proceedings in England that started between 1 April 2007 and 31 March 2017. Only identifiers for women aged 15 to 50 years involved in section 31 proceedings will be flowing. No other Cafcass data will flow. The flow of identifying data from Cafcass to NHS Digital is permitted under Practice Direction 12G, which enables Cafcass to lawfully share information about family court proceedings for the purposes of an approved research project.

These identifiers will be linked to records held in the Personal Demographics Service (PDS) using an algorithm that prioritises the most recent postcode in Cafcass (at the end of the last case the woman was involved in). The identifiable information disclosed by Cafcass will be used by NHS Digital to facilitate linkage with PDS data in order to identify patients within the HES data.

• For the remaining unmatched Cafcass records, the second postcode in Cafcass will then be compared with PDS as above. This approach will be repeated up to a maximum of 3 postcodes in Cafcass. (Only up to 3 postcodes per person are available from Cafcass for linkage but UCL would expect NHS Digital to compare these with up to 5 postcodes from the PDS).

• A linked file will be created, with the pseudo-study ID provided by Cafcass and the matching NHS number and postcode as recorded in PDS.

• The national data opt-out will then be applied to this linked file and anyone from the Cafcass cohort who has opted out of having their data shared for research or planning purposes will be removed from the linked file.

• UCL will use HES data from an existing data extract (NIC-393510-D6H1D) which is currently held by UCL to create a minimised extract containing HES APC, A&E, Outpatient data and Civil Registration data on only women who had a record of a live birth between the ages of 15 and 50 years old between 1 April 1997 and 31 March 2017.

NHS Digital will disclose a file of pseudonymised HES-IDs for women within the Cafcass dataset to UCL to enable linkage with this existing HES-Civil Registration extract for the purpose of this study. The NHS Digital file will not contain the NHS Number or the postcode.

• NHS Digital will retain the identifier file of all individuals linked in Cafcass-PDS and PDS-HES and all the postcodes used in linkage and postcode dates for 12 months to address data queries. This data set will not contain any attribute data and will be held separately from the final analysis file.

• NHSD will transfer to UCL a list of pseudonymised study IDs for women who link and encrypted HES-IDs for women who link and women who do not link.
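The postcode cascade and opt-out steps in the data flow above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not NHS Digital's linkage algorithm: the record layout, field names and function names are hypothetical, matching uses surname and date of birth only (the flow also supplies first name), the first matching PDS record is accepted without resolving ambiguous matches, and all records shown are fictitious. The agreement describes the cascade pass by pass over the unmatched remainder; trying each record's postcodes in priority order, as here, gives the same result when records are matched independently.

```python
from typing import Dict, List, Optional, Set

MAX_CAFCASS_POSTCODES = 3  # data flow: up to 3 postcodes per person from Cafcass
MAX_PDS_POSTCODES = 5      # data flow: compared with up to 5 postcodes from PDS


def normalise(postcode: str) -> str:
    """Strip spaces and upper-case so formatting differences do not block a match."""
    return postcode.replace(" ", "").upper()


def link_record(cafcass: dict, pds_records: List[dict]) -> Optional[str]:
    """Try Cafcass postcodes in priority order (most recent first).

    Returns the NHS number of the first PDS record that agrees on surname,
    date of birth and the postcode being tried, or None if nothing matches.
    """
    for postcode in cafcass["postcodes"][:MAX_CAFCASS_POSTCODES]:
        target = normalise(postcode)
        for pds in pds_records:
            same_person = (
                pds["surname"].lower() == cafcass["surname"].lower()
                and pds["dob"] == cafcass["dob"]
            )
            if not same_person:
                continue
            pds_postcodes = {normalise(p) for p in pds["postcodes"][:MAX_PDS_POSTCODES]}
            if target in pds_postcodes:
                return pds["nhs_number"]
    return None


def build_linked_file(
    cafcass_rows: List[dict], pds_records: List[dict], opted_out: Set[str]
) -> Dict[str, str]:
    """Map pseudo-study ID to NHS number, dropping anyone with a national data opt-out."""
    linked = {}
    for row in cafcass_rows:
        nhs_number = link_record(row, pds_records)
        if nhs_number is not None and nhs_number not in opted_out:
            linked[row["pseudo_study_id"]] = nhs_number
    return linked


# Fictitious records for illustration only.
cafcass_rows = [
    {"pseudo_study_id": "S1", "surname": "Example", "dob": "1980-01-01",
     "postcodes": ["AA1 1AA", "BB2 2BB"]},
    {"pseudo_study_id": "S2", "surname": "Sample", "dob": "1985-05-05",
     "postcodes": ["CC3 3CC"]},
    {"pseudo_study_id": "S3", "surname": "Nomatch", "dob": "1990-09-09",
     "postcodes": ["DD4 4DD"]},
]
pds_records = [
    {"nhs_number": "FAKE-001", "surname": "EXAMPLE", "dob": "1980-01-01",
     "postcodes": ["bb2 2bb"]},
    {"nhs_number": "FAKE-002", "surname": "Sample", "dob": "1985-05-05",
     "postcodes": ["CC3 3CC"]},
]
linked = build_linked_file(cafcass_rows, pds_records, opted_out={"FAKE-002"})
```

In this toy run, S1 links on its second postcode, S2 links but is removed by the opt-out, and S3 does not link.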

There will be no requirement or attempt to re-identify individuals.

Data processing is only carried out by substantive employees of UCL who have been appropriately trained in data protection and confidentiality. The data requested will be kept in UCL's Data Safe Haven (DSH). A file transfer mechanism enables information to be transferred into the Safe Haven simply and securely.

The UCL DSH uses Dual Factor Authentication to access and handle data transferred into the DSH service. This ensures that only the named applicants will have access to the data from DSH. Removing data from the Data Safe Haven is only allowed for the Principal Investigator.

UCL will not attempt to link the Cafcass cohort linked to HES and Civil Registration data disseminated under NIC-393510-D6H1D to any of the other datasets held under NIC-393510-D6H1D.

Analysis of the linked dataset will involve the following steps:
Question 1: Are there health characteristics of mothers that are associated with a high likelihood of care proceedings?

• As a first step, UCL will determine the proportion of mothers in the linked Cafcass-HES cohort who have prior indicators of vulnerability (history of admissions for mental health problems, injuries related to self-harm, drug or alcohol misuse or violence, exposure to violence, chronic conditions, or young age (<20 years) at (previous) live birth).
• UCL will determine the frequency of hospital contacts (planned and unplanned) for this group of women and describe how this relates to timing of pregnancy and birth, as well as the care proceeding. The findings will inform policy makers about the potential for health services to offer more holistic care for this population of women before they are involved in care proceedings (e.g. are indicators of vulnerability recorded before pregnancy?), and whether proactive care could be warranted during care proceedings (e.g. if women are at risk of exacerbations such as self-harm admissions during proceedings).

From previous research as part of the Children Policy Research Unit (NIC-393510) UCL will have information on the sub-cohort of mothers with history of risk factors for child maltreatment (e.g. mental health, alcohol/illicit drug, or violence related hospital admissions) as compared to the whole population, for instance the rates and ages at first live birth.
• The main analyses for question 1 will determine whether there are specific factors in the healthcare records before and during birth and immediately after that could identify groups of mothers who might benefit from targeted early intervention. By having a comparator group of mothers not involved in court proceedings, UCL will quantify the relative importance of these risk factors, and identify potential points of intervention for healthcare services (e.g. at antenatal care, or during hospitalisation prior to pregnancy or birth).
• In secondary analyses, UCL will determine whether associations between healthcare risk factors and having a record of care proceedings are affected by indicators of unmet healthcare need such as emergency admissions to hospital or serious healthcare conditions in the mother. UCL hypothesise that contact with healthcare services provides opportunities for interventions, such as the provision of contraception to delay future pregnancies, or treatment for mental health problems, which might improve capacity to parent.
• UCL will use multiple regression techniques to study these associations. This will provide UCL with an estimate of the likelihood of care proceedings in each risk group, adjusted for risk factors such as area-based deprivation levels and underlying health needs.

Question 2: Among mothers involved with care proceedings, what characteristics are associated with time to subsequent pregnancy, adversity related admissions and adverse outcomes related to court?
• Two key outcomes for question 2 are the timing of subsequent pregnancies after an initial set of care proceedings and involvement in recurrent care proceedings, given a further pregnancy. From previous work, UCL would expect that approximately 25% of mothers will have recurrent care proceedings, but it is not currently known how many women have subsequent pregnancies.
• UCL will also analyse a range of secondary outcomes obtained from the health data for mothers, including rates of hospital admission and death. Assessing these outcomes will allow UCL to examine how indicators of healthcare need are associated with the primary outcome 'timing of subsequent pregnancies and care proceedings'. Findings from these analyses will inform inferences about whether input from healthcare might help prevent further care proceedings.
• UCL will use multivariable survival analyses to determine time to next pregnancy and recurrent care proceedings, and whether this is related to underlying risk factors evident before or during the index care proceedings. UCL will also explore whether any associations are mediated through healthcare events or needs that are indicated by HES records after the first set of care proceedings. For example, UCL will determine whether risk factors such as maternal age at first birth, presence of other children, underlying chronic health conditions, a history of injury due to self-harm, drug or alcohol misuse or violence, are associated with the timing of subsequent pregnancy and recurrent care proceedings and whether these risk factors could be used to identify mothers most likely to experience further care proceedings.
• The analyses will quantify the relative contribution of health events and healthcare needs for mother and/or child to the likelihood of recurrent care proceedings. If UCL find strong associations, UCL could hypothesise that health interventions postpone the timing of subsequent pregnancy and reduce the likelihood of recurrent care proceedings. The findings will provide evidence on the potential for healthcare interventions to reduce the risk of subsequent care proceedings.
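To make the time-to-event framing above concrete, here is a minimal sketch of a Kaplan-Meier estimate of the probability of not yet having a subsequent pregnancy by a given time after the index care proceedings. This is a deliberately simpler, univariate technique swapped in for illustration; it is not the multivariable survival analysis the agreement describes, which would additionally adjust for risk factors. The function name and the example durations are hypothetical.

```python
from typing import List, Tuple


def kaplan_meier(durations: List[float], observed: List[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    durations: follow-up time per woman (e.g. months since index proceedings).
    observed:  True if the event (e.g. next pregnancy) occurred at that time,
               False if follow-up ended without the event (censored).
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            events += int(data[i][1])
            removed += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Fictitious follow-up times in months, for illustration only.
curve = kaplan_meier(
    durations=[6, 12, 12, 18, 24],
    observed=[True, True, False, True, False],
)
```

Censored observations reduce the number at risk without counting as events, which is what lets women with incomplete follow-up contribute to the estimate.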

When describing and analysing differences in secondary care service use among the study and control cohorts, UCL will use proxy measures for potentially unmet healthcare need such as having recurrent A&E attendances, readmissions to hospital, and emergency admissions. In addition, UCL will look at reasons for and length of emergency admissions; for example, 0-1 night stays, ambulatory care sensitive conditions as well as admissions related to mental health, substance misuse and exposure to violence have been used previously to identify potentially avoidable admissions to hospital.

However, UCL also recognise the limitations of the data. As UCL attempt to answer these policy questions, UCL expect to develop hypotheses that could be addressed in further research focussing on evaluating interventions to respond to and reduce unmet healthcare need in the study population.

UCL have formed a Project Advisory Group that includes practitioners across health, social care and family justice to support the interpretation of study findings. In addition, similar work by researchers in Wales using Cafcass Cymru linked to NWIS (NHS Wales Informatics Service) data within the SAIL data bank will offer opportunities to compare findings across England and Wales to strengthen UCL's interpretation of study findings.

No new data will flow to UCL (University College London) for the purpose of this study. The data for the control cohort has already flowed to UCL under DARS NIC-393510, therefore UCL are not requesting any additional data on hospitalisations, A&E attendances, outpatient appointments, or civil registrations to flow between NHS Digital and UCL.

UCL requires data on the whole-population (i.e. all women with a live birth episode in HES between 15 and 50 years old in England) to accurately quantify sociodemographic and health differences between mothers in England who are involved in s31 applications (i.e. who link to Cafcass) and mothers who are not. UCL also expect these differences to vary by region and by local authority as children’s social care practice and local population characteristics vary considerably across England, as well as by time (i.e. 2007 to 2019). Understanding differences by area and time is crucial to ensuring any study findings are generalisable and relevant to both national and local policy makers and service planners.

For some analyses UCL must use statistical matching methods to match mothers in HES who link (to Cafcass) to controls (mothers in HES who don’t link to Cafcass) based on a number of variables in HES (e.g. age at first HES birth episode, age and number of children at baseline, ethnicity, local authority of residence, comorbidities recorded in HES etc). The variables used in matching, as well as the timing for ‘baseline’ measurements, will vary by analysis depending on the study outcome. In order to achieve balance across matching variables, UCL require a large pool of controls representative across all local authorities in England. In addition, for particularly rare study outcomes (such as maternal mortality and adversity-related admissions) UCL require a sufficient number of controls to robustly model outcomes.
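One simple form of statistical matching is exact matching within strata, sketched below. The agreement does not say which matching method UCL use, so this is an assumption chosen for clarity; the field names, the 2:1 control-to-case ratio and the fixed random seed are likewise hypothetical. The sketch also shows why a large control pool matters: a case in a sparsely populated stratum may receive fewer controls than requested.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def match_controls(
    cases: List[dict],
    controls: List[dict],
    keys: Sequence[str],
    ratio: int = 2,
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Assign each case up to `ratio` controls sharing the same values on `keys`.

    Controls are drawn at random without replacement, so no control is
    reused across cases. Cases in thin strata may get fewer than `ratio`.
    """
    rng = random.Random(seed)
    pool = defaultdict(list)
    for control in controls:
        pool[tuple(control[k] for k in keys)].append(control["id"])
    for ids in pool.values():
        rng.shuffle(ids)

    matches = {}
    for case in cases:
        stratum = pool[tuple(case[k] for k in keys)]
        take = min(ratio, len(stratum))
        matches[case["id"]] = [stratum.pop() for _ in range(take)]
    return matches


# Fictitious records for illustration only.
cases = [
    {"id": "case1", "age_band": "20-24", "local_authority": "LA1"},
    {"id": "case2", "age_band": "30-34", "local_authority": "LA2"},
]
controls = [
    {"id": "c1", "age_band": "20-24", "local_authority": "LA1"},
    {"id": "c2", "age_band": "20-24", "local_authority": "LA1"},
    {"id": "c3", "age_band": "20-24", "local_authority": "LA1"},
    {"id": "c4", "age_band": "30-34", "local_authority": "LA2"},
    {"id": "c5", "age_band": "20-24", "local_authority": "LA2"},
]
matches = match_controls(cases, controls, keys=("age_band", "local_authority"))
```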

The proposed control group would not result in any additional data flowing between NHS Digital and UCL, and a lack of a whole-population control group would impact feasibility of the whole study and UCL's ability to meet the objectives of this study, to ensure that analyses are robust, and to inform policy and practice at the local as well as national level.


MR1450 - National Child Development Study (NCDS) — DARS-NIC-137864-T1P9B

Type of data: information not disclosed for TRE projects

Opt outs honoured: Y, Identifiable, Yes (Section 251 NHS Act 2006)

Legal basis: Section 251 approval is in place for the flow of identifiable data, Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 - s261(5)(d)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2019-01-18 — 2022-01-17 2018.03 — 2023.06. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. MRIS - List Cleaning Report
  2. Demographics

Objectives:

The National Child Development Study (NCDS) is the second of Britain’s world renowned national longitudinal birth cohort studies. It follows all those born in one week in 1958 through the course of their lives, charting the effects of experiences in early life on outcomes and achievements later on. The study has its origins in the Perinatal Mortality Survey. Sponsored by the National Birthday Trust Fund, this was designed to examine the social and obstetric factors associated with stillbirth and death in early infancy among the children born in Great Britain in that one week. Information was gathered from almost 17,500 babies.

Since 1958 information has been gathered from the NCDS cohort on nine occasions. Over time, the scope of enquiry has broadened from a strictly medical focus at birth, to encompass physical and educational development at the age of seven, physical, educational and social development at the ages of eleven and sixteen, and then to include economic development and other wider factors at ages 23, 33, 42, 44, 46, 50 and 55. The next NCDS survey will take place in 2019 when study members will be aged 61.

In 1958, when the birth survey was carried out, consent to participate in surveys was gained by respondents agreeing to be interviewed or respondents returning the completed questionnaire to the study team. Involvement in subsequent surveys adopted the same approach. Individuals could withdraw from the study at any time by simply expressing the wish to do so.

In all recent follow-ups the approach to collecting consent has been very similar. During fieldwork, study members were sent an advance letter advising them about the survey. The letter was accompanied by an information leaflet explaining what is involved. Study members had the opportunity to request further information, or to opt out of the survey at this point. They could also seek further information, or refuse further involvement when the interviewer attempted to make an appointment to visit; when the interviewer visited and at any point during the administration of any elements of the surveys.

Of the approximately 18,500 individuals that have ever participated in the study there are now approximately 3,500 for whom the Centre for Longitudinal Studies (CLS) at University College London do not currently have a confirmed address. These are not individuals who have informed CLS that they wish to withdraw from the study; CLS have simply lost touch with them.

The ongoing success of the study depends on maintaining contact with as large a number of study members as possible. Therefore, CLS are seeking permission to be supplied with updated addresses for these 3,500 study members whose whereabouts are currently unknown. All of these individuals have made an informed decision to participate in the study over the years and have been made aware that the study is seeking to follow them throughout their lives.

Objective:

Each year CLS sends an annual birthday card postal mailing in March to all NCDS participants. CLS asks that participants complete a ‘reply slip’ which is returned to CLS which allows participants to provide CLS with any change in their details e.g. a new email address, phone number, etc. CLS also ask them to return the reply slip even if none of their details have changed i.e. seeking a positive confirmation that that is the address CLS hold for them.

As a result, CLS can maintain the cohort's latest details on the NCDS database. In the event of the birthday card not reaching the participant it is returned to CLS as a ‘return to sender’. CLS will attempt to trace all these returns – but if CLS cannot locate the participants then they are flagged on the database as a ‘gone-away’. It is these cases (3500) that are being sent to NHS Digital for list cleaning as the NHS may potentially hold a more recent address and provide CLS with an opportunity to invite the cohort to re-join the study.

NHS Digital will supply new addresses for untraced study members who can be matched to the NHS Central Registry/Personal Demographics Service (PDS).

CLS need to trace lost study members between now and the Age 61 survey in 2019 which is currently in the planning stage. Any study members successfully traced via this route would be written to and asked to provide updated contact details. They will then subsequently be invited to participate in the NCDS Age 61 survey (unless they withdraw from the study).

All those the researchers would seek to trace have participated in at least one prior sweep of the study and none have ever informed CLS that they no longer wish to participate in the study. The researchers feel that a substantial number of these individuals would be willing to participate in the study if they could be contacted. Previous efforts to re-establish contact for other cohort studies have been very successful using this route. When the cohort are contacted they will be given the opportunity to withdraw.

If the participant has died no contact will be made and the study will be updated to reflect this.

Yielded Benefits:

Expected Benefits:

Benefits of the list cleaning:

Submitting the cohort for list cleaning will allow the researchers to recontact the participants who CLS have lost touch with and give them the opportunity to re-engage or clearly state that they wish to withdraw. It will also minimise the risk of literature going to the incorrect address and of contact being made with participants who have died.

Benefits of the Study:

The 1958 cohort, as it approaches age 60, has now entered a critical period for the understanding of the heterogeneous processes of ageing, and in particular how earlier life experiences impact on health and well-being later in life. The continued ageing of the population in the UK and elsewhere make the understanding of healthy ageing a top priority policy concern, across a wide range of health and social policy domains.

The main outcome from the NCDS Age 61 survey will be a fully documented, anonymised research dataset and this will be archived with the UK Data Service in early 2022 to provide a strategically important resource for UK Social Science, including researchers in health and social care.

GENERAL BACKGROUND INFORMATION & CONTEXT inc. PUBLICATIONS

The Age 61 Survey will be comprised of two major components:
1) A core interview which will cover the following topics:
- Health, well-being and cognition: physical health, mental health, medical care, health behaviours (e.g. smoking, drinking, diet, exercise), cognitive function.
- Finances and employment: work, income, wealth (savings and debts, pensions, & housing), retirement plans & education.
- Family, relationships and identity: social networks, relationships with partners, parents, children, friends, neighbourhood, social capital, social and political participation, attitudes and values, and religion.

2) A detailed biomedical assessment including measures of anthropometry, physical functioning, cardiovascular risk factors and a full range of blood tests. The central aim of this proposed biomedical assessment is to enable new research that will inform key public health concerns.

The cohort is now transitioning between midlife and early older age, a critical time when biological ageing in key systems (e.g., cardiovascular, metabolic, immunity) starts to accelerate, and a series of health conditions that have a profound influence on well-being first become clinically manifest.

The information collected during the Age 61 Survey will enable researchers to uncover life course and inter-generational factors which contribute to healthy ageing among this generation, and thus to inform the development of preventative health policies across the whole of life that will expand healthy life expectancy, and reduce the burden of ill-health and disease at older ages.

Below are some examples of existing publications using NCDS data benefiting public health:

• Power, C., & Matthews, S. (1997). Origins of health inequalities in a national population sample. The Lancet, 350(9091), 1584-1589.
• Hyppönen E, Power C. Hypovitaminosis D in British adults at age 45 y: nationwide cohort study of dietary and lifestyle predictors. Am J Clin Nutr. 2007; 85 (3):860-8.
• Strachan, D.P., 2000. Family size, infection and atopy: the first decade of the 'hygiene hypothesis'. Thorax, 55 (Suppl 1), p.S2.
• Clark C, Rodgers B, Caldwell T, Power C, Stansfeld S. Childhood and adulthood psychological ill health as predictors of midlife affective and anxiety disorders: the 1958 British Birth Cohort. Arch Gen Psychiatry. 2007; 64 (6):668-78.
• Orfei L, Strachan DP, Rudnicka AR, Wadsworth M. Early influences on adult lung function in two national British cohorts. Arch Dis Child. 2008; 93 (7):570-4.
• Johnson W, Li L, Kuh D, Hardy R. How Has the Age-Related Process of Overweight or Obesity Development Changed over Time? Co-ordinated Analyses of Individual Participant Data from Five United Kingdom Birth Cohorts. PLoS Med. 2015; 12 (5):e1001828.

Outputs:

Any study members choosing not to take part in the study are flagged on the secure confidential address database at the CLS with a code denoting whether their refusal is temporary (i.e. for a particular wave/survey) or permanent (i.e. they wish to have no further involvement in the study). Any previously deposited pseudo-anonymised survey data for a study member and confidential data from the address database are retained unless the study member specifically asks us not to, in which case this data is securely deleted.

These addresses obtained from NHS Digital will be used to maintain contact with study members, e.g. to send them a special birthday mailing for their 60th birthday in March 2018 and then later to invite them to take part in the Age 61 survey. However, the data received via this application is never sent to or published via the UK Data Service.

The main outcome from the NCDS Age 61 survey will be a fully documented, anonymised research dataset and this will be archived with the UK Data Service in early 2022 to provide a strategically important resource for UK Social Science, including researchers in health and social care.

Processing:

NHS address tracing. CLS wish to use the patient status and tracking products, which use NHS registration data, to trace as many NCDS study members as possible, either by finding new address details or verifying existing address details for the cohort.

CLS will supply NHS Digital with a file of around 3500 study members to match to the NHS data. The file supplied will only contain eligible study members who have participated in at least one wave of NCDS. It will not include study members known to have died or to have withdrawn from the study. The file will contain the following data items:

- CLS identifier
- First name
- Last name
- Middle name (where available),
- Date of birth
- Sex
- Last known address, and postcode
- NHS Number

NHS Digital would supply the following details to CLS:

- CLS identifier
- Latest surname
- Latest forename
- Latest middle name (where available),
- Date of birth
- Gender
- Latest address and postcode
- Fact of Death (and embarkations)
- Date of address registration or update
- NHS Number.

In addition to the receipt of any 'new' matched address information for the study members, NHS Digital will add an additional variable that describes the outcome of the matching process to the data that is returned to CLS – that is, this additional variable will allocate each study member to one of the following four categories:

• new/different address found,
• existing address confirmed,
• no match found,
• participant has died.
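
As an illustration only, the four outcome categories above could be assigned along the following lines. This is a sketch under stated assumptions: the names `MatchOutcome` and `classify`, the comparison rule (equality after normalising case and whitespace) and the precedence of death over address outcomes are all hypothetical and are not described in the agreement.

```python
from enum import Enum


class MatchOutcome(Enum):
    # Labels follow the four categories listed in the agreement.
    NEW_ADDRESS_FOUND = "new/different address found"
    EXISTING_ADDRESS_CONFIRMED = "existing address confirmed"
    NO_MATCH_FOUND = "no match found"
    PARTICIPANT_DIED = "participant has died"


def _normalise(address: str) -> str:
    # Assumed rule: ignore case and repeated whitespace so that trivial
    # formatting differences are not reported as a change of address.
    return " ".join(address.upper().split())


def classify(matched: bool, died: bool,
             supplied_address: str, traced_address: str) -> MatchOutcome:
    """Allocate one study member to exactly one outcome category."""
    if not matched:
        return MatchOutcome.NO_MATCH_FOUND
    if died:
        # Assumption: a recorded death takes precedence over any address result.
        return MatchOutcome.PARTICIPANT_DIED
    if _normalise(supplied_address) == _normalise(traced_address):
        return MatchOutcome.EXISTING_ADDRESS_CONFIRMED
    return MatchOutcome.NEW_ADDRESS_FOUND
```

Because the checks run in a fixed order, each study member falls into one category only, which matches the single additional variable described above.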

The data file supplied from NHS Digital will be processed within CLS only and entered into CLS’s secure database, i.e. CLS will load more recent addresses into the database. All NCDS study members' contact information is held in this secure confidential address database at the Centre for Longitudinal Studies and used to maintain contact with study members and to invite them to take part in the NCDS Age 61 survey.

Study members newly traced would be written to and invited to re-engage with the study. Any newly traced study members who on being contacted were to indicate that they no longer wish to participate in the study would be recorded as a 'permanent refusal' on the CLS database and not approached again.

All those accessing the data supplied by NHS Digital are substantive employees of University College London.

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract, i.e. employees, agents and contractors of the Data Recipient who may have access to that data).


Assessing the utility of healthcare systems data for trials: data utility comparisons in the STAMPEDE trial (DUCkS)(previously: ODR1718_094) — DARS-NIC-656801-R9F6Z

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant, Yes, No (Mixture of confidential data flow(s) with consent and flow(s) with support under section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 – s261(2)(c); Health and Social Care Act 2012 – s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'., Health and Social Care Act 2012 – s261(2)(c)

Purposes: Yes (Academic)

Sensitive: Non-Sensitive, and Sensitive

When:DSA runs 2022-10-24 — 2025-10-23 2022.11 — 2023.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Emergency Care Data Set (ECDS)
  2. Hospital Episode Statistics Accident and Emergency
  3. Hospital Episode Statistics Admitted Patient Care
  4. Hospital Episode Statistics Outpatients
  5. NDRS Cancer Registry
  6. NDRS Linked HES A&E
  7. NDRS Linked HES APC
  8. NDRS Linked HES Outpatient
  9. NDRS National Radiotherapy Dataset (RTDS)
  10. NDRS Systemic Anti-Cancer Therapy Dataset (SACT)
  11. Hospital Episode Statistics Accident and Emergency (HES A and E)
  12. Hospital Episode Statistics Admitted Patient Care (HES APC)
  13. Hospital Episode Statistics Outpatients (HES OP)
  14. NDRS Cancer Registrations
  15. NDRS Linked HES AE

Objectives:

University College London (UCL) are requesting NHS Digital record level data for the Study: "Assessing the utility of healthcare systems data for trials: data utility comparisons in the STAMPEDE* trial (DUCkS)"

*STAMPEDE (Systemic Therapy in Advancing or Metastatic Prostate Cancer: Evaluation of Drug Efficacy) is a clinical study. STAMPEDE aims to identify new treatments for prostate cancer.

Data for this study has previously been shared when the data were controlled and managed by Public Health England (PHE). PHE facilitated data release via its Office of Data Release service (ODR). ODR was responsible for providing a common governance framework for responding to requests to access PHE data for secondary purposes, including service improvement, surveillance and ethically approved research. All requests to access data were reviewed by the ODR and were subject to strict confidentiality provisions. The responsibility for the management of the National Disease Registration Service, of which the National Cancer Registration and Analysis Service is a part, transferred from PHE to NHS Digital on 1st October 2021. The STAMPEDE study previously accessed data via Public Health England under the reference: ODR1718_094.

The Medical Research Council Clinical Trials Unit at University College London (MRC CTU at UCL) is the Data Controller, who also processes data for this study. Hereafter the Data Controller shall be referred to as University College London or UCL. UCL is the trial’s Sponsor and delegates sponsorship responsibilities to MRC CTU at UCL.

The STAMPEDE study has received funding from Cancer Research UK and the DUCkS study has received funding from Health Data Research (HDR) UK. HDR UK and Cancer Research UK do not determine the purpose(s) the data will be processed for nor undertake any role which defines them as Data Controllers. HDR UK and Cancer Research UK will also have no access to NHS Digital data other than in the form of aggregated and suppressed tabulations (as per the HES analysis guide).

***THIS VERSION (v1): OCTOBER 2022***
This is a request to Renew and Amend a Data Sharing Agreement (DSA) with University College London.

In order for trials to use healthcare system data, UCL are requesting record level NHS Digital data in order to assess the utility of the data by comparison with trial collected data. The study team intend to do this for data from approximately 10,500 participants from within the STAMPEDE trial.

STUDY AIMS
University College London aim to assess the concordance agreement between traditional trial-specific data collection and healthcare systems data (“routinely-collected healthcare data”) in approximately 10,500 STAMPEDE participants. The analyses will involve assessment of five objectives:

1. Assessment of survival,
2. Chemotherapy treatments,
3. Radiotherapy treatment,
4. Second-line treatment,
5. Toxicities.

UCL are therefore requesting NHS Digital record level health data linked to specific participants of the STAMPEDE trial. The study team will curate the data to be analysed against the five objectives above, working through each in turn.

NEW DATA REQUESTED
UCL are requesting access to the following NHS Hospital Episode Statistics (HES) data sets:
- HES Outpatients (OP)
- HES Admitted Patient Care (APC)
- HES Accident and Emergency (A&E)
- Emergency Care Dataset (ECDS)

The HES data from NHS Digital provides fields additional to those previously received from PHE and will therefore enrich the data already held. To date, approximately 10,500 people have joined STAMPEDE within England and Wales, and recruitment continues (though study Arms A, H-L are due to close their recruitment around September-October 2022). However, this agreement is for a one-off extraction of record-level data, and the study does not anticipate returning for health data on those recruited after the cohort has been provided to NHS Digital under this agreement.

UCL are also requesting further access to the following National Cancer Registration and Analysis Service (NCRAS) National Disease Registration Service (NDRS) datasets (formerly available via Public Health England (PHE)):
- NDRS Radiotherapy Data Set (RTDS)
- NDRS Systemic Anti-Cancer Therapy (SACT)
- NDRS Cancer Registry

These data sets, holding information on patients treated with radiotherapy or chemotherapy, allow the audit to build a full picture of the treatment provided to cancer patients, with in-depth analysis of specific regimens and changes to prescribed treatments. They also allow exploration of whether the radiotherapy and chemotherapy data items, collected by the Audits from hospitals, are appropriate and necessary. Should particular data be available in existing, national data sets, these data items could be removed from the data collection, to ease the burden on data providers (hospital staff). RTDS data is linked to Cancer Registry data using the ‘Patient ID’ pseudonym; it is not possible to request RTDS data in isolation without this linkage, therefore NDRS Cancer Registry has also been requested.

In order to establish a full picture of the data, the study team are requesting HES and NDRS data from 2005/06 to 2021/22 (where datasets will allow) for approximately 10,500 participants from the STAMPEDE trial.

PREVIOUSLY HELD DATA
To note: The study team currently hold data which was collected via PHE (under ODR1718_094), for the period 2005/06 to 2017/18 for a historic (previously submitted) cohort. This includes:
- NDRS RTDS
- NDRS SACT
- NDRS Linked HES A&E Data
- NDRS Linked HES Inpatient Data
- NDRS Linked HES Outpatient Data

PATIENT AND PUBLIC INVOLVEMENT AND ENGAGEMENT (PPIE)
PPIE was sought to discuss the use of routine data of trial participants recruited before 2013, where it was unknown if they also consented to data linkage (optional participation). Two focus groups of twelve and seven people met in July and September 2021 respectively. They were recruited by Prostate Cancer UK and Cancer Research UK. All who attended have been affected by cancer in some capacity, either a patient or a family member.

Overall, the groups expressed agreement that it was acceptable to access routine data of trial participants without consent provided there was transparency, i.e. information stating that this was being done in line with the reasons given on the trial website. As only a few trial participants (up to 12) did not agree to linkage of their routine data before 2013 and it was not known why, the groups felt it was acceptable to use the routine data as most of the trial cohort would have agreed, and it was not practical to go back to them to ask for retrospective consent. Some felt involvement in the clinical trial should automatically allow for the use of routine data, so participants would have to opt-out or withdraw from the trial if they didn’t agree. Consequently, the privacy notice was revised on the trial website to clearly explain the use of routine data without explicit consent to linkage.

Members discussed how the patient information sheet and consent form should clearly state that routine data will be accessed and that participation in the trial would permit access to the records held by NHS Digital and other data providers. Both documents have provided this information about data linkage since 2013 when all patient-facing documentation was updated.

COMMON LAW DUTY OF CONFIDENTIALITY
The study team at UCL will be providing one cohort for this request containing approximately 10,500 individual records. Within this cohort, approximately 1,600 individuals will be flagged as submitted under Section 251.

The separate legal bases for dissemination are as follows:
1. Informed Consent
Consent to data linkage has been sought for approximately 8,900 participants, spreading over all arms of the STAMPEDE trial. NHS Digital will not apply National Data Opt-Out for these participants and are content that the consent materials are compatible with the flow of data described in this agreement.

2. Section 251 Support from the Confidentiality Advisory Group (CAG)
For STAMPEDE trial arms A-G, approximately 1,600 participants were recruited, however consent for data linkage was not recorded. The dissemination of data for these participants is covered by the section 251 approval which has support from the Confidentiality Advisory Group (CAG). CAG application reference is 21CAG0048. This cohort of participants will have National Data Opt-Outs applied.

NOTE: to date five participants have directly opted out of data linkage by informing the study team. These five and any future participants who directly opt-out of data linkage with the study team will be removed from the cohort and record-level data on these participants will not be disclosed by NHS Digital.

LAWFUL BASIS FOR THE PROCESSING OF PERSONAL DATA (GDPR)
The Data Controller will process the data under GDPR Article 6 (1) (e) - Processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller. As a higher education establishment, the University conducts research to improve health care and services, and the linkage requested is necessary for the performance of a task carried out in the public interest, i.e. improving the treatment offered to men with prostate cancer.

Additionally, under GDPR Article 9(2)(j) processing of Special Category Personal Data is necessary for archiving for research purposes. Data minimisation process is being followed and only data that is required specifically for the purposes of this study has been requested, to protect the rights of the data subjects.

-----------------------
Version 0 (JULY 2022) - This request was previously handled by Public Health England (PHE) under the reference ODR1718_094.

OVERVIEW
Project Aims:
1. To identify better ways of obtaining clinical trial outcome data to (A) Repeat reported STAMPEDE analyses, to validate the PhD project's algorithm capabilities and the trial results; (B) Perform new, secondary analyses not possible with conventionally-collected trial data, including but not limited to, rates of neutropenic sepsis when chemotherapy is given at different times in their disease or cardiac events or hormone therapy.
2. To develop a clinically useable tool (e.g. algorithm), for use in a clinical trial, to accurately identify disease-driven events and trial outcomes.
3. To help reduce the burden of collecting clinical trial data from traditional patient and clinician contact. By using data that has already been accurately collected by the NHS, it may be possible to improve timeliness, reduce costs and save resources that can be used elsewhere.

Objectives:
1. Develop, enhance, and validate a methodology to calculate trial outcomes from EHRs. Can the new tool accurately detect disease-related events and trial outcome events, and therefore successfully identify treatment effect differences? The tool must:
a. Either be able to use HES data alone or use non-HES-based data in addition
b. Be clinically useable for planned applications
2. Identify if the routine data can be used directly for clinical trial follow-up through comparison assessments of specific clinical events and treatments recorded in routine data and trial data (survival, chemotherapy, radiotherapy, second-line treatment, toxicities including safety events).
a. If yes, then utilise this data for routine trial follow-up.

COMMERCIAL BENEFIT AND TRANSPARENCY
For transparency it is noted here that the following pharmaceutical companies provided discounted or free drugs used in the STAMPEDE Trial and funding of Educational grants in return for accessing the final primary outcomes report and record-level trial-related data (NOT ODR/NHS Digital data), but had no influence over how the trial was conducted nor how outcomes were reported.
> Sanofi Aventis, Novartis, Pfizer, Janssen, Astellas and CLOVIS.

These pharmaceutical companies have no involvement with the DUCkS analysis in this agreement.

Yielded Benefits:

Data for this study has previously been shared when the data were controlled and managed by Public Health England (PHE). As such there are some yielded benefits to be observed from the access to the data for the study prior to NHS Digital becoming data controller. These yielded benefits are noted below. Cancer progression and recurrence is not routinely recorded within patient health records, and so it is difficult for clinical trialists to accurately estimate time to cancer progression. Time to cancer progression is often used as a primary trial outcome to determine if a trial intervention is effective or not. The development of an algorithm that uses routinely collected health data to better estimate fact and time of cancer progression should allow trialists to test if a cancer treatment is effective. The data previously obtained from PHE (obtained under ref: ODR1718_094) has enabled MRC CTU at UCL to develop and test an algorithm to estimate cancer recurrence. The results have been written up in a PhD thesis and are also expected to be published as a peer-reviewed research article in 2023.

Expected Benefits:

Benefits type: Knowledge about use of healthcare systems datasets for clinical trials.

The DUCkS trial is in place to assess the utility of healthcare systems data for trials.

The results of the DUCkS project aim to support trialists, funders and NHS Digital in the better use of HES, ECDS, and NCRAS data in clinical trials. Dissemination of the results from the DUCkS study aims to enable the trials community to understand if centrally-collated national datasets can replace trial-specific data collection of important outcomes such as chemotherapy and radiotherapy treatments and the occurrence of serious safety events (for example cardiovascular events and toxicities).

Should the DUCkS study prove that data collection through national datasets is effective, this has the potential to improve efficiency of trials, saving time and money. Although the exact savings for the NHS cannot be predicted, it is hoped this will be achieved through research nurses/practitioner/doctors spending less time on data collection for clinical trials. Consequently, trials could potentially be completed more quickly with less missing data and fewer participants lost to follow-up. This supports innovation and faster research and development of better treatments.

It is hoped that the project could form a basis for HDR UK research activities in the "Transforming Data for Trials" programme over the 5 years from 2023. One of the key aims is to transform the utility of healthcare systems data for trials.

The DUCkS project hopes to pave the way for others to conduct data comparison studies which could determine the utility of other datasets to support clinical trials. Additional studies should provide the trials community with further evidence of which healthcare datasets can be used for specific trial outcomes. This will allow more trials to be designed using these data as the main resource, minimising time and effort by healthcare workers to collect this information. Instead, they would have more time to provide care to patients.

Outputs:

The project aims to complete the analyses by April 2023. The study team plan to write up results (containing aggregate data with small numbers suppressed) for publication and dissemination. The study team intend to submit papers to journals by Autumn 2023 and it is hoped these will be ready for publication around Autumn 2024.

If subsequent funding is secured, the study team may wish to retain the data and, via an extension to the agreement, extend the data comparison studies to investigate the utility and completeness of data for other key clinical events in relation to the STAMPEDE trial.

The MRC CTU at UCL plan to communicate the methods used and the results of these data comparison studies via:
- Presentations at major international and national scientific conferences
- Publications in high-impact peer-reviewed journals
- Training workshops disseminating study results to trialists, funders and other key stakeholders.
- Results may also be made available via the HDR Innovation Gateway to enable researchers to decide data relevance for their studies.

The study team also aim to communicate the main STAMPEDE trial results via a booklet to trial participants and to share findings via the trial's website.

The study team plan to write a paper on the methodology used and a paper on the results. The aim is to disseminate the results via a workshop around March 2023.

Processing:

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract). There will not be any access to the data by any third parties.

This application is to request the renewal of NCRAS data (previously provided under the agreement with the Office for Data Release ODR1718_094), and a request for additional HES data to cover a new cohort of approximately 10,500 individuals.

The data flow will be as follows:
1. University College London will send two cohort files (of approx. 10,500 patient records in total), each with the identifiers listed below, to NHS Digital securely via Secure Electronic File Transfer (SEFT). One cohort file will contain individuals processed under Section 251 and one cohort file will contain individuals processed under consent.
2. The NHS Digital data production team will link patient identifiers to the NHS Digital datasets (HES OP, HES APC, HES A&E, ECDS, NCRAS Cancer Registrations, SACT and NCRAS RTDS)
3. NHS Digital will apply National Data Opt-Outs to the cohort of participants received under s251 (approx. 1,600).
4. The NHS Digital production team will then remove identifiable patient information from the linked data.
5. The NHS Digital production team will securely send the de-identified linked data, containing the Study ID, to the data recipient at University College London via SEFT.
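
The data flow above can be sketched in code for illustration. This is a minimal sketch under stated assumptions, not NHS Digital's actual process: the names `CohortMember` and `build_release`, the dictionary layout and the use of NHS Number as the only linkage key are hypothetical simplifications (the agreement lists several identifiers per cohort), and the NDRS-specific opt-out is not modelled.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class CohortMember:
    study_id: str     # STAMPEDE trial participant identifier
    nhs_number: str   # used here as the sole linkage key (assumption)
    legal_basis: str  # "consent" or "s251"


def build_release(cohort: List[CohortMember],
                  records_by_nhs_number: Dict[str, List[dict]],
                  national_opt_outs: Set[str]) -> List[dict]:
    """Return de-identified linked rows keyed by Study ID only."""
    release: List[dict] = []
    for member in cohort:
        # Step 3: National Data Opt-Outs apply to the s251 cohort only.
        # Checking before linkage gives the same result as filtering after it.
        if member.legal_basis == "s251" and member.nhs_number in national_opt_outs:
            continue
        # Step 2: link the cohort member to their records.
        for record in records_by_nhs_number.get(member.nhs_number, []):
            # Step 4: remove the identifier from the linked row.
            row = {k: v for k, v in record.items() if k != "nhs_number"}
            # Step 5: the returned row carries the Study ID only.
            row["study_id"] = member.study_id
            release.append(row)
    return release
```

In this sketch a consented participant with a registered National Data Opt-Out is still included, while an s251 participant with one is dropped, mirroring the distinction drawn between the two cohort files.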

To facilitate the linkage of the STAMPEDE cohorts to HES and NCRAS data, the study team at University College London will securely transfer the following identifiers to the NHS Digital data production team:

FOR CONSENTED COHORT
• Study ID (STAMPEDE trial participant identifier)
• NHS Number
• First Name
• Last Name
• Postcode

FOR SECTION 251 COHORT
• Study ID (STAMPEDE trial participant identifier)
• NHS Number
• Date of Birth
• Postcode
There is no linkage field for gender as the entire trial cohort is male.

These pseudonymised datasets with the Study ID will be sent to the MRC CTU at UCL using the specified transfer method above. The data will reside in UCL’s Data Safe Haven and will be identified by Study ID only, thus there will be no identifying personal data attached to a study number. Only defined members of the DUCkS study team and MRC CTU’s methodology team will have access to Data Safe Haven for data analysis - all are either substantive employees of UCL or Chief Investigator, Comparison Chief Investigators or clinical delegates involved in the blinded review of clinical information for this project only who are under a contract with UCL. All UCL substantive employees have completed training in data protection and confidentiality, and users of Data Safe Haven receive appropriate training before being granted access.

MRC CTU at UCL is not permitted to re-identify individuals under this agreement.

DATA OPT OUT
>National Data Opt-Out
NHS Digital record level data sought under s251 under this agreement is subject to the National Data Opt-Out. If an individual has invoked their right to opt out from the use of their data for research or planning purposes (the National Data Opt-Out), their data will not be released under this Agreement. This will not apply for participants in the cohort who have provided informed consent.

> NDRS specific Data Opt-Out
NCRAS data is subject to both the NDRS Opt-Out as well as National Data Opt-Out. If an individual has indicated that they wish to be excluded from the national cancer registry, their data will be permanently removed from all NCRAS datasets within 20 days of receipt. This means that for this agreement, for both the cohort under s251 and the consented cohort, no NCRAS data will be released for any person who has registered an NDRS specific opt out which has been completed.

DATA STORAGE AND ANALYSIS
The data will be held on UCL’s Data Safe Haven using UCL approved computers. The Data Safe Haven is UCL’s technical solution for transferring and storing research information that is highly confidential. It meets the requirements of the NHS Digital DSP Toolkit and ISO 27001 Information Security standard. Access is controlled by the Information Asset Owner, and all UCL staff complete training in confidentiality and data protection, which is renewed annually.

Statistical data analysis will be carried out via UCL owned devices connected to the secure UCL network remotely, using an appropriate statistical package. To remotely access the server requires a secure 2-factor authenticator (VPN) and users are then able to securely access the secure server on the University’s IT framework. All data analysis will be conducted within the confines of the University’s secure Data Safe Haven, and will not be downloaded to remote devices for storage or processing. The export of record-level data is not permitted from the Data Safe Haven under access restrictions.

HES and ECDS DISCLOSURE CONTROL / SMALL NUMBER SUPPRESSION
In order to protect patient confidentiality, when presenting results calculated from HES record level data, outputs will contain only aggregate level data with small numbers suppressed in line with the HES Analysis Guide. When publishing HES data, data processors must make sure that:
• Only national-level figures may be presented unrounded, without small number suppression.
• Cell values from 1 to 7 (inclusive) are suppressed at a sub-national level to prevent possible identification of individuals from small counts within the table.
• Zeros (0) do not need to be suppressed.
• All other counts will be rounded to the nearest 5.
Data will not be made available to any third parties other than those specified except in the form of aggregated outputs with small numbers suppressed in line with the HES Analysis Guide.
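
The suppression and rounding rules listed above can be expressed as a short function. This is an illustrative sketch only: the function name `disclose`, the use of `None` to mark a suppressed cell and the `national` flag are assumptions made for the example and do not come from the HES Analysis Guide.

```python
from typing import Optional


def disclose(count: int, national: bool = False) -> Optional[int]:
    """Return the publishable value for one table cell, or None if suppressed."""
    if count < 0:
        raise ValueError("counts cannot be negative")
    # National-level figures may be presented unrounded and unsuppressed,
    # and zeros never need suppressing.
    if national or count == 0:
        return count
    # Sub-national cells from 1 to 7 inclusive are suppressed.
    if 1 <= count <= 7:
        return None
    # All other counts are rounded to the nearest 5. Counts are whole
    # numbers, so no tie-breaking rule is needed.
    return 5 * ((count + 2) // 5)
```

For example, under these assumptions a sub-national count of 12 would be published as 10 and a count of 13 as 15, while a count of 7 would not be published at all.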


Understanding excess child and adolescent mortality in the UK — DARS-NIC-141410-W6H4Y

Type of data: information not disclosed for TRE projects

Opt outs honoured: No - data flow is not identifiable, Anonymised - ICO Code Compliant, No (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(2)(b)(ii), Health and Social Care Act 2012 - s261 - 'Other dissemination of information', Health and Social Care Act 2012 – s261(2)(a)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Non-Sensitive, and Sensitive

When:DSA runs 2018-11-07 — 2021-11-06 2019.04 — 2023.01. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

  1. Hospital Episode Statistics Admitted Patient Care
  2. Hospital Episode Statistics Outpatients
  3. Civil Registration - Deaths
  4. Hospital Episode Statistics Accident and Emergency
  5. HES:Civil Registration (Deaths) bridge
  6. Hospital Episode Statistics Critical Care
  7. Civil Registration (Deaths) - Secondary Care Cut
  8. Emergency Care Data Set (ECDS)
  9. Civil Registrations of Death - Secondary Care Cut
  10. Hospital Episode Statistics Accident and Emergency (HES A and E)
  11. Hospital Episode Statistics Admitted Patient Care (HES APC)
  12. Hospital Episode Statistics Critical Care (HES Critical Care)
  13. Hospital Episode Statistics Outpatients (HES OP)

Objectives:

University College London (UCL) are the sole Data Controller who also process data for this project. There are no other organisations involved in the project.

This project was part of a successful application for a Medical Research Council (MRC) Clinical Research Training Fellowship. The project will contribute to a PhD. The funder is not involved in any aspect of the analysis.

The objective of this project is to explore why the rate at which children and young people (CYP) die in the United Kingdom is higher than in many other developed countries.

In the 1970s UK child mortality rates (the number of child deaths per 100,000 population) were similar to those in comparable wealthy nations, and in many areas the UK performed well. Although UK child mortality has been falling since then, the rate of decline has been slower than in other countries, and the UK now has one of the highest child mortality rates in Europe. If the UK had a mortality rate similar to Sweden (the best performing country), about 2000 fewer children would die each year, or 5 fewer a day.

Aims of the study:

1) Identify causes of death in CYP 0-24 where the UK performs poorly compared to similar countries using publicly available data provided by the WHO World Mortality Database.

2) Analyse geographic and socioeconomic variability in mortality outcomes by cause for CYP 0-24 within England and Wales using death certification data provided by the Office for National Statistics (Office for National Statistics. (2017). Death Registrations in England and Wales, 1993 – 2016: Secure Access. [data collection] 2nd Edition. Accessed via UK Data Service).

3) Analyse the contribution of health service factors to mortality for CYP 0-24 in England for causes of death identified in aims 1 and 2 (i.e. causes of death where the UK performs poorly internationally and where there is wide geographic and socioeconomic variability in outcomes). This will require analysing data on health service use prior to death provided by Hospital Episode Statistics linked with Civil Registration (Deaths) data, requested from NHS Digital.

Analyses within aim 3 will be performed by age-group / sex as appropriate and will include:
a. Variability in the contribution of health service factors to predominant mortality causes for CYP in England by NHS provider Trust in children and young people 0-24.
b. Variability in the contribution of health service factors to predominant mortality causes for CYP in England by demographic factors (e.g. socioeconomic status).
c. An analysis of how the contribution of health service factors to predominant mortality causes for CYP in England has changed over time 2007 – 2017.

For aim 3 processing, the data subjects are all children and young people aged 0-24 who have accessed secondary health services in England between 2007 and latest date available. The data requested is Civil Registration (Deaths) Secondary Care which is to be linked to Hospital Episode Statistics (HES) Accident and Emergency, HES Outpatients, HES Admitted Patient Care, HES Critical Care.

The datasets requested are Hospital Episode Statistics (HES) Accident and Emergency, HES Outpatients, HES Admitted Patient Care, HES Critical Care, Civil Registration (Deaths) Secondary Care. The data are pseudonymised (no identifiable variables will be required).

The data requested will achieve the identified aim by allowing cohorts to be identified of CYP who have accessed secondary health services with the predominant conditions where UK mortality is higher than in other wealthy countries within each age-group and by sex. This will allow a comparison of patterns of healthcare usage amongst CYP who died of these conditions with age matched controls, who did not die but presented to health services with the same diagnosis. The predominant conditions will be determined from the analysis outcomes of aim 1 and 2.

The years of data that will be requested are 2007 to latest date available. These dates were defined due to:

a) Constraints of data availability
In order to examine healthcare utilisation across all types of secondary care use, the study will require data for years where all datasets are available (HES Accident and Emergency, HES Outpatients, HES Admitted Patient Care 2007/2008; HES Critical Care 2008/2009).

b) Examine healthcare use prior to death
Multiple years of data are required to examine healthcare use (planned outpatient appointments / missed appointments / emergency admissions) over three year periods amongst CYP who died compared with those who did not die but attended secondary services with the same diagnosis. These patterns of healthcare use will be used as markers of severity, standard of care received, and predictors of mortality risk in the years prior to death.

c) The need to combine deaths over several years for some causes
It will be necessary to combine deaths/admission episodes over 3-5 year periods due to anticipated low numbers in some age / sex groups / regions of the country for some causes.

d) To examine trends over time
A key aim of this project is a longitudinal analysis to examine trends in any association between healthcare utilisation and mortality over time.

The geographic spread of the data (England) will allow for an analysis of how patterns of healthcare use prior to death vary by NHS provider trust/geographic region in England.

The evidence for excess UK mortality extends throughout the early life course, with high total mortality amongst infants and 1-4 year olds, and high non-communicable diseases (NCD) mortality for all CYP age-groups, particularly adolescents and young people (10-24). In order to fully explore the contribution of health service factors to excess CYP mortality causes, and how this varies by age, the project will require data on secondary healthcare use and mortality within England for children and young people 0-24.

It was considered at length whether data could be filtered to specific conditions of relevance. This project will require data for all secondary care attendance in CYP 0-24, linked to mortality outcomes, for all causes over the study period (2007 – 2016). Causes of death will be mapped to the Global Burden of Disease mortality hierarchy across 4 levels. For example, acute lymphoblastic leukaemia (level 4) is classified within leukaemias (level 3), neoplasms (level 2) and non-communicable diseases (NCD) (level 1). Due to the low number of deaths in CYP in each year/sex/age group, it will not always be possible to analyse mortality by level 4 cause, and causes may need to be aggregated by level 3, level 2, or even level 1 group. The level at which a cause of death can be analysed will only be determined after the number of deaths/attendances to secondary care within the dataset (by sex/age group/year) are known. This need to be flexible to allow grouping of causes over different levels depending on numbers of deaths and admissions will mean it will not be possible to perform the analysis if data are only requested on mortality and healthcare use for specific causes. Thus, it is not possible to limit the request to only specific conditions.
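The fallback across hierarchy levels described above can be sketched as follows. This is a minimal illustration, not the study's actual method: the `HIERARCHY` table, the function name `finest_reportable_level`, the `min_count` threshold and the death counts are all hypothetical. Only the acute lymphoblastic leukaemia path is taken from the text; the second cause is added for illustration.

```python
from collections import Counter

# Hypothetical lookup: level-4 cause -> (level 1, level 2, level 3, level 4).
# The first path is the example given in the text; the second is illustrative.
HIERARCHY = {
    "acute lymphoblastic leukaemia": (
        "non-communicable diseases", "neoplasms", "leukaemias",
        "acute lymphoblastic leukaemia",
    ),
    "acute myeloid leukaemia": (
        "non-communicable diseases", "neoplasms", "leukaemias",
        "acute myeloid leukaemia",
    ),
}


def finest_reportable_level(deaths, min_count):
    """Return (level, counts) for the most detailed level, 4 down to 1,
    at which every cause group has at least min_count deaths.

    Returns (None, {}) if even level 1 has a group below min_count.
    min_count is an assumed threshold, not a value stated in the source.
    """
    for level in (4, 3, 2, 1):
        counts = Counter()
        for cause, n in deaths.items():
            counts[HIERARCHY[cause][level - 1]] += n
        if all(c >= min_count for c in counts.values()):
            return level, dict(counts)
    return None, {}


# Invented counts for one sex/age group/period cell, for illustration only.
example = {"acute lymphoblastic leukaemia": 6, "acute myeloid leukaemia": 3}
level, grouped = finest_reportable_level(example, min_count=8)
```

With these invented counts, neither level-4 cause reaches the assumed threshold, so the two are combined at level 3. This is why the level of analysis cannot be fixed before the counts in the dataset are known.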

The study will only use the minimum amount of personal data required to perform the analyses. The data requested will not be identifying, and will be pseudonymised. Data will then be aggregated by 5-year age group, cause of death group and 3-5 year period (by year of death/admission).

The research proposal and dissemination plan were presented to members of the National Children’s Bureau Young Research Advisors (YRAs) group in March 2018, as part of a Patient and Public Involvement and Engagement initiative. The YRAs are a diverse group of CYP recruited from across the country who have received training in research methods and policy. A focus group of 25 young people aged 7-22 (and parents) was held to discuss the acceptability of the research methods (including the use of data without consent). The YRAs were supportive of the importance of the research and the necessity of analysing data without consent. Specific feedback regarding strategies to inform young people and their families of the research was incorporated into the project proposal and transparency statement.

The individual accessing the data under this agreement is a substantive employee of UCL and is funded by an MRC Clinical Research Training Fellowship which is held by UCL.

The wider study is the PhD project, which includes the analysis that will use HES data and the other analyses described in the application. The PhD project has three aims: 1) Identify causes where the UK performs poorly compared with other wealthy nations; 2) Analyse variation in cause-specific mortality by region of the UK/England and socioeconomic status; and 3) Compare health service use for predominant causes of child and young person mortality amongst children who died with those who did not die. The third strand is the strand of the PhD which will use the data disseminated under this agreement.

Expected Benefits:

This project will increase understanding of high UK child and young person mortality, directly impacting on efforts to improve outcomes, and thus enhance the quality of life, health and wellbeing of the population.

The research findings will achieve these benefits by informing public health, healthcare systems, and healthcare financing research. This has the potential to directly influence health policy development for CYP, leading to reform of services and improved outcomes. NHS England is currently developing its 10-year long-term plan, working closely with people on the project including the Principal Investigator, and reducing excess child mortality is a central plank in planned work for CYP. Other countries (e.g. the Netherlands) have significantly reduced infant and child mortality over the past decade through targeted interventions based upon knowledge of where the problems lie – and this research will provide data to inform similar targeting of interventions in England.

In addition to the moral case for reducing CYP mortality, there are substantial economic benefits. CYP are the workers of the next 20 years and the parents of the next generation. Higher mortality amongst CYP in the UK compared with other wealthy nations puts the UK at a direct economic productivity disadvantage: essentially the UK is losing 1000 potential workers each year compared with the European average, and 2000 per year compared with the best in Europe. Improving the survival of healthy children and young people in the UK will directly contribute to national wealth and productivity.

The number of healthcare users affected by excess CYP mortality, and so who would potentially benefit as a result of this research, is large. Reducing current CYP mortality to be the best in Europe would save 2000 lives a year, or 5 a day, and as UK outcomes are set to further diverge from other wealthy countries, this number is likely to increase.

Analysing the contribution of health service factors to excess UK mortality for CYP 0-24 (requiring the data processing activities described above) will directly influence health service delivery reform. The findings will enable health providers (e.g. NHS England; Clinical Commissioning Groups; Trusts) to identify variation in performance within certain groups of causes of death relating to NHS provider. This will allow local services to learn from the best performing units, and so introduce specific interventions to improve outcomes. These benefits may be realised within 2-3 years of finalisation of the research.

In the medium term (3-5 years), these findings will benefit research into implementing different models of accessing paediatric specialists in the community, already established in the best performing countries for CYP mortality. The findings may also be directly used to plan studies investigating how to intervene in improving child health services; for example, the Evelina Children and Young People's Health Partnership.

In the longer term (5-10 years), this analysis could be used as evidence to support a fundamental change in the way national health services are delivered for CYP in the UK. This is likely to include improving integration between primary and secondary services, and a move away from the UK’s predominantly hospital-centric model. This will improve health service efficiency and sustainability, further benefiting healthcare users.

This study is in support of a PhD research study.

Outputs:

The primary output will be the analysis of secondary healthcare usage amongst CYP prior to death in England for causes where UK mortality is poor, compared with age matched controls. This will provide estimates of contributions of a range of health system and provider factors to excess CYP UK mortality.

The first stage of analysis will be completed within 6 months of gaining access to the data (Jun 2019) and the aim is to publish preliminary results within 1 year (Dec 2019). The final analysis will be completed within 18 months (Jun 2020).

Each sub-analysis within aim 3 will form a separate publication exploring the contribution of health service factors to mortality outcomes by NHS provider trust, socio-economic status and changes over time. The primary targets for publication will be peer-reviewed journals including the Lancet, British Medical Journal and Archives of Disease in Childhood. Estimated publication date for these analyses will be Dec 2019 – Sept 2020.

The wider project will contribute to a PhD thesis which will be submitted to UCL in September 2020.

Findings will also be presented at national and international conferences such as the Royal College of Paediatrics and Child Health (RCPCH) and International Paediatric Association (IPA), and through public and media initiatives organised through UCL and King's College London. Other professional bodies such as the Royal College of Nursing, Royal College of General Practitioners and the British Association for Child and Adolescent Public Health will provide further opportunities for knowledge exchange and communication to a range of interested parties. Charities focusing on CYP will also be potential partners for dissemination and will include the NSPCC and the Child Accident Prevention Trust, who actively campaign to reduce UK child mortality. All publications, conference presentations, media engagements and other dissemination activities are promoted on Twitter, via institutional (UCL) accounts and the Principal Investigator’s (>1500 followers).

The aims, methods and ethical considerations of this project were presented to members of the National Children’s Bureau Young Research Advisors group in March 2018. As part of this process, the Young Research Advisors expressed interest in presenting the main research findings in an accessible way for young people, which will be facilitated by the National Children’s Bureau. This may include a written summary of the report, short videos, animations, or engaging with social media platforms.

All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.

Processing:

This study will require Civil Registration (Deaths) Secondary Care to be linked to Hospital Episode Statistics (HES) Accident and Emergency, HES Outpatients, HES Admitted Patient Care, HES Critical Care. NHS Digital will perform this linking of data.

HES data on CYP secondary healthcare use in England between 2007 and latest available date, linked with CYP from civil registration (deaths) data, will flow out of NHS Digital to UCL.

The NHS Digital data will be transferred to UCL Data Safe Haven and the NHS Digital data will only be analysed within the UCL Data Safe Haven. Data will be fully anonymised prior to the analysis. The individual level data will be aggregated by cause of death group, 3-5 year period (by year of death/admission), 5-year age group (1-4, 5-9, 10-14, 15-19, 20-24) and sex. The anonymised data will then be extracted from UCL Data Safe Haven after analysis.
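The aggregation step described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's code: the record field names (`cause_group`, `year`, `age`, `sex`), the 2007 start year for periods, the default 3-year period length and the separate "<1" band for infants are all hypothetical choices. The age bands from 1-4 upward are the ones listed in the text.

```python
from collections import Counter

# Age bands as listed in the text.
AGE_BANDS = [(1, 4), (5, 9), (10, 14), (15, 19), (20, 24)]


def age_band(age):
    """Map an age in whole years to its band label."""
    if age < 1:
        return "<1"  # assumption: infants kept as their own band
    for low, high in AGE_BANDS:
        if low <= age <= high:
            return f"{low}-{high}"
    raise ValueError(f"age {age} is outside the 0-24 study range")


def period(year, start=2007, length=3):
    """Map a year of death/admission to a fixed-length period label."""
    first = start + ((year - start) // length) * length
    return f"{first}-{first + length - 1}"


def aggregate(records, length=3):
    """Count individual-level records per (cause group, period, age band, sex)."""
    counts = Counter()
    for r in records:
        key = (r["cause_group"], period(r["year"], length=length),
               age_band(r["age"]), r["sex"])
        counts[key] += 1
    return dict(counts)


# Invented records, for illustration only.
records = [
    {"cause_group": "neoplasms", "year": 2007, "age": 3, "sex": "F"},
    {"cause_group": "neoplasms", "year": 2009, "age": 4, "sex": "F"},
    {"cause_group": "neoplasms", "year": 2010, "age": 12, "sex": "M"},
]
table = aggregate(records)
```

After this step only cell counts remain. No row in the result corresponds to a single person, although cells with very small counts still need suppression before release.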

The analysis of secondary health care usage amongst CYP prior to death in England for causes where the UK performs poorly, compared with age matched controls, will then be performed on the anonymised dataset.

All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.
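Small-number suppression of an aggregated table can be sketched as follows. This is a minimal illustration: the function name, the `threshold` value and the `*` marker are assumptions, and the actual rule to apply should be taken from the HES Analysis Guide itself.

```python
def suppress_small_numbers(table, threshold=8, marker="*"):
    """Replace every non-zero count below threshold with a marker.

    threshold=8 is an assumed default, not a value stated in the source.
    Zero counts are left as they are.
    """
    return {
        key: (marker if 0 < count < threshold else count)
        for key, count in table.items()
    }


# Invented counts, for illustration only.
released = suppress_small_numbers({"group A": 3, "group B": 12, "group C": 0})
```

This sketch covers primary suppression only. It does not protect a suppressed cell that could be recovered by subtracting the other cells from a published total.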

Data will only be accessed by individuals within UCL who have authorisation to access the data for the purpose described, all of whom are substantive employees of UCL.

The data will not be linked with any record level data. There will be no requirement nor attempt to re-identify individuals from the data. The data will not be made available to any third parties other than those specified except in the form of aggregated outputs with small numbers suppressed in line with the HES Analysis Guide.

NHS Digital reminds all organisations party to this agreement of the need to comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract, i.e. employees, agents and contractors of the Data Recipient who may have access to that data).


MR623 - NATIONAL MOTHER AND CHILD COHORT — DARS-NIC-148128-815J1

Type of data: information not disclosed for TRE projects

Opt outs honoured: Y, Identifiable, No (Statutory exemption to flow confidential data without consent)

Legal basis: Health and Social Care Act 2012, Health and Social Care Act 2012 – s261(7); Other-Regulation 3 of The Health Services Regulations 2002

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2017-03-31 — 2020-11-01 2016.12 — 2022.11. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL), NHS ENGLAND (QUARRY HOUSE)

Sublicensing allowed: No

Datasets:

  1. MRIS - Cohort Event Notification Report
  2. MRIS - Scottish NHS / Registration
  3. MRIS - Cause of Death Report
  4. MRIS - Flagging Current Status Report
  5. MRIS - Members and Postings Report
  6. Cancer Registration Data
  7. Civil Registration - Deaths
  8. Civil Registrations of Death

Objectives:

The data supplied by the NHS IC to UCL Institute of Child Health will be used only for the approved Medical Research Project, National Mother and Child Cohort.

Yielded Benefits:

Flagging methods have been established and 100 death and cancer event notifications received for 8877 children reported in the NSHPC with a flagged status until 2017.

Expected Benefits:

It would not be appropriate or valid to share project results at this point in this long-term surveillance activity. The planned later dissemination of findings following access to data under a future version of this Agreement will be in the public interest for the following reasons:

1. The vast majority of children born to women living with HIV were not only exposed to HIV in utero but also to antiretroviral drugs, used to prevent vertical (mother-to-child) transmission and to treat maternal HIV disease. All people with HIV require lifelong treatment with antiretroviral drugs (ARVs) and it is recommended that treatment starts as soon as a new diagnosis is made; early initiation of ARVs with good adherence to medication should result in achievement of life spans similar to those seen in uninfected people. Specifically, for pregnant women, alongside the benefits for their own health, widespread and early use of ARVs has resulted in the risk of vertical transmission decreasing from around 18-25% without treatment to 0.2-0.3% in the current treatment era. The enormous benefits of ARVs within and outside pregnancy are therefore undisputable. However, ARVs have both known and unknown safety concerns when used in pregnancy, highlighting the importance of pharmacovigilance in this uniquely exposed population. Antiretroviral drugs have had reported mutagenic and carcinogenic effects, in addition to haematological and mitochondrial toxicities. The recent safety signal of increased risk of neural tube defects with periconception use of the integrase inhibitor Dolutegravir highlights the evidence gaps around use of newer ARVs in pregnancy. Understanding the effects of HIV and ART exposure in foetal and perinatal life will inform treatment guidelines and contribute to risk-benefit analyses of the use of different combinations of ARVs.

2. There is a growing body of research that suggests children who are HIV-exposed and uninfected (CHEU) have poorer morbidity and mortality outcomes than children HIV-unexposed and uninfected (CHUU). This project will provide an evidence-base for evaluating these outcomes for CHEU in the UK and enable key stakeholders such as PHE and the NHS to design and implement measures that could alleviate health inequities in this growing population.

3. Given that the in utero exposures to ARVs have already taken place in thousands of individuals born to mothers living with HIV, should there be a signal of concern from the data collected then this will require an appropriate, ethical and proportionate response in terms of communication of results requiring expert input from key stakeholders such as the MHRA and PHE.

This project can be considered a first phase in the long-term surveillance of CHEU in England and Wales.

Outputs:

No new outputs will be produced under this Data Sharing Agreement.

There will be no dissemina