What are Trusted Research Environments (TREs), which are being adopted in England's “Data Saves Lives” strategy?


This is a republication of the article below, with the title above.


Trusted Research Environments (TREs) — A Guide for Beginners


Carnall Farrar
1 January 2022


Edited by


Joaquim Cardoso MSc
Health Transformation Institute

Research Institute for Continuous Health Transformation
for Better Health for All
June 14, 2022


Executive summary

Trusted research environments (TREs) enable researchers to gain access to data in a safe way. 

They do so by creating highly secure digital environments that provide remote access to information and analytical tools in a single place, removing the need for data to leave the safe location to be sent out to researchers.

TREs are not strictly defined, but some regard the Five Safes framework as a central requirement for forming a TRE.

This consists of safe people, safe projects, safe settings, safe data and safe outputs.


Significant progress has been made in this area in the last few years, leading to the formation of a number of TREs. These may vary in nature, but all offer a solution with the potential to increase public trust in data use.

With the current debate around patient privacy and health data sharing, there is a need for the public to be aware of the changes taking place in how their data is being stored and used, and the range of measures that are in place to protect their information.

Although TREs are not the perfect solution to all data-based use cases, TREs are a key part of the delivery of the NHS Data Strategy. 

NHS Digital has a TRE service in place for England, which has answered key Covid-19 related questions on cardiovascular diseases and is being expanded to cover cancer.


The technology supporting TREs is evolving rapidly, and exciting new solutions will continue to emerge as needs arise.

Making data accessible in a safe manner, and communicating this, is a critical area to advance. 

Standards in data and in access are amongst the most important areas for development. 

It is important to recognise that currently TREs are of use in research, but not in addressing clinical or operational needs. 

There is a danger that enthusiasm for protecting and restricting data will impede the flow of data and limit research. This cannot be allowed to happen.

In addition, capability needs to exist to support and deliver the opportunities of TREs. 

We must plan carefully for how this can be supported, including by being relatively flexible about using the capability that exists across universities, the NHS and the private sector.


Structure of the publication

  • Context
  • What is a TRE?
  • Benefits of TREs
  • The Five Safes 
  • Users and creators of TREs
  • Case studies 
  • Limitations 
  • The future of TREs in the NHS 
  • Conclusions

Context


Although TREs have been developed and used in the UK for over ten years, there has been a recent upward trend in the development and use of TREs, particularly in relation to healthcare data.

Covid-19 resulted in an urgent need to make up-to-date data available, ideally at a national level, to a wide range of researchers and users. TREs provide an efficient way of granting access to data, in some cases allowing researchers remote access, which removes the need for individuals to travel to physical safe rooms to carry out meaningful analyses.

Researchers require access to sensitive data. Sensitive data can include personally identifiable information such as names and addresses, commercially sensitive information, and also information which has been de-identified, but remains sensitive due to the potential for re-identification.

This takes place against a backdrop of current concerns and misinformation around data use amongst some members of the public. TREs address some of the growing concerns around privacy and data protection. For example, the widely used method of transferring data to researchers is falling out of favour, whilst providing data within TREs is gaining favour.

The TRE landscape is evolving rapidly. Demand is growing for broader and deeper health data and for data integrated across more sources (including regions across the UK), and advances in technology are enabling this (including faster processing, new technology solutions, and cheaper storage such as cloud storage). As for the need for deeper data, the risk of re-identification increases as data on individuals is linked across a wider range of datasets.
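The point about linkage and re-identification risk can be illustrated with a small sketch. It uses made-up records and hypothetical field names, and simply counts how many records become unique as more attributes are combined. It is an illustration of the general idea only, not a description of how any particular TRE assesses risk.

```python
from collections import Counter

# Hypothetical, made-up records containing no direct identifiers.
records = [
    {"age_band": "40-49", "region": "North", "condition": "asthma"},
    {"age_band": "40-49", "region": "North", "condition": "diabetes"},
    {"age_band": "40-49", "region": "South", "condition": "asthma"},
    {"age_band": "50-59", "region": "North", "condition": "asthma"},
    {"age_band": "50-59", "region": "North", "condition": "asthma"},
    {"age_band": "50-59", "region": "South", "condition": "diabetes"},
]


def group_sizes(rows, fields):
    """Count how many rows share each combination of values for `fields`."""
    return Counter(tuple(row[f] for f in fields) for row in rows)


def unique_rows(rows, fields):
    """Number of rows that are the only one with their combination of values."""
    return sum(1 for size in group_sizes(rows, fields).values() if size == 1)


# Each extra linked attribute splits the population into smaller groups.
for fields in (["age_band"],
               ["age_band", "region"],
               ["age_band", "region", "condition"]):
    print(fields, "-> unique records:", unique_rows(records, fields))
```

In this toy data no record is unique on age band alone, two are unique once region is added, and four are unique once condition is added as well.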

CF recently conducted a landscape review of existing data infrastructure across the research councils of the UK, highlighting the variety of technologies in use, and the issues users currently face.[1] This work supported the UK trusted and connected Data and Analytics Research Environments programme (DARE UK), an ambitious project focused on connecting existing TREs. This piece summarises our thinking in the wider context of healthcare data specifically.



What is a TRE?

TREs are technical solutions that enable researchers to access and make use of data in a safe and secure manner. Also known as Data Safe Havens, they are highly secure digital environments that provide remote access to information and analytical tools in a single place.

Although not strictly defined, TREs can be described by a harmonised set of principles that they abide by. They are similar in concept to a reference library: researchers come to the library to use the information, but the information does not leave the library.[2]

Use of TREs reduces the risk of unauthorised individuals obtaining access to the information, and the risk of a user being able to re-identify individuals from de-identified data.


Benefits of TREs

There are numerous benefits to forming and using TREs in healthcare settings. Most importantly, TREs hold data securely, ensuring patient privacy is maintained. Data that could be used to directly identify an individual is not available in TREs (for example name, date of birth, address or NHS number). De-identified data on individuals also cannot be taken out of a TRE. Only approved summary tables can be extracted from a TRE once analysis has been done (for example, aggregated tables of counts or graphs).

Additionally, TREs allow for increased oversight on what data is being used for. For example, audit processes can identify which datasets are being accessed and by whom.
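As a minimal sketch of the kind of audit record described above, the following keeps an in-memory log of dataset accesses and answers the two questions mentioned: which datasets were accessed, and by whom. The class names, field names and example values are all hypothetical. A real TRE would use its provider's own logging and storage.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AccessEvent:
    """One record of a user opening a dataset under an approved project."""
    user: str
    project: str
    dataset: str
    timestamp: datetime


class AuditLog:
    """In-memory audit trail (illustrative only)."""

    def __init__(self):
        self._events = []

    def record(self, user, project, dataset):
        event = AccessEvent(user, project, dataset, datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def datasets_accessed_by(self, user):
        """Sorted names of datasets the given user has accessed."""
        return sorted({e.dataset for e in self._events if e.user == user})

    def users_of(self, dataset):
        """Sorted names of users who have accessed the given dataset."""
        return sorted({e.user for e in self._events if e.dataset == dataset})


# Example usage with made-up names.
log = AuditLog()
log.record("researcher_a", "project_1", "cardiovascular")
log.record("researcher_a", "project_1", "cancer")
log.record("researcher_b", "project_2", "cardiovascular")

print(log.datasets_accessed_by("researcher_a"))
print(log.users_of("cardiovascular"))
```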

TREs not only protect data, but can open safe data to approved projects, which are not limited by geography. Research with TREs has the potential to enable deeper insights and to accelerate research. Making data more accessible and promoting interaction across users of the same TRE platform may also enable collaboration across disciplines.

Finally, demonstrating trustworthiness through safe use of data by using a TRE has the potential for researchers and data controllers to build public and patient trust.



The Five Safes

Many organisations are promoting the use of the Five Safes framework for the responsible use of data, developed by the Office for National Statistics.[3] Although TREs are not strictly defined, these Five Safes are regarded by some as requirements for TREs.[4]

  • Safe People — Researchers must be trained and approved
  • Safe Projects — Projects involving data must be ethical and approved
  • Safe Settings — Secure technology means data never leaves the safe location, i.e., the TRE
  • Safe Data — Data used by researchers must be de-identified
  • Safe Outputs — Outputs must be checked to be sure they cannot be used to identify an individual
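The five conditions listed above can be thought of as a checklist that must be satisfied in full. The sketch below models that idea. The class and field names are hypothetical, and in practice each "safe" is a governance process rather than a simple yes/no flag.

```python
from dataclasses import dataclass, fields


@dataclass
class FiveSafesCheck:
    """Illustrative checklist; each flag stands in for a real approval process."""
    safe_people: bool    # researchers are trained and approved
    safe_projects: bool  # the project is ethical and approved
    safe_settings: bool  # data stays inside the secure environment
    safe_data: bool      # data has been de-identified
    safe_outputs: bool   # outputs have been checked for disclosure risk

    def failures(self):
        """Names of the conditions that are not yet met."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def passed(self):
        """True only when all five conditions are met."""
        return not self.failures()


# Example: everything is in place except output checking.
check = FiveSafesCheck(
    safe_people=True,
    safe_projects=True,
    safe_settings=True,
    safe_data=True,
    safe_outputs=False,
)
print(check.passed())
print(check.failures())
```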

A recently published White Paper on TREs by the UK Health Data Research Alliance and NHSX outlines detail on best practice for TREs, as well as the principles of research across an ecosystem of TREs.[5]


Users and creators of TREs

TREs are used by a range of organisations including research institutions such as universities and research councils, government bodies and related organisations (e.g., the Office for National Statistics), the NHS and the NHS Digital TRE, charities (e.g., Cancer Research UK) and private companies such as Palantir and its Foundry platform.

TREs are created by a variety of organisations. Some TREs such as the NHS Digital TRE have been built in-house. Others use external TRE providers such as AIMES and Aridhia.



Case studies

Examples of key TREs in the UK include:

  • NHS Digital TRE for England which provides academics access to cardiovascular and cancer data for Covid-19 research
  • Public Health Scotland’s National Safe Haven which provides access to health and administrative data via the electronic Data Research and Innovation Service (eDRIS) platform
  • HSC Northern Ireland’s Honest Broker Service TRE which provides researchers with health and social care data on the population of Northern Ireland
  • SAIL Databank which provides researchers around the world with secure remote access to datasets with anonymised person-based health and social care data records for the population of Wales
  • ONS Secure Research Service which gives accredited or approved researchers secure access to de-identified, unpublished data (including the Census) for research projects for the public good
  • Genomics England Research Environment which has over 2,000 researchers carrying out analysis with a range of tools in its high performance computing environment, with genome data from the 100,000 genomes project[7]
  • OpenSAFELY which is a secure analytics platform for NHS electronic health records of over 58 million people from England
  • CO-CONNECT which standardises antibody data collection across the UK and provides access to data for research on immunity to Covid-19
  • UK Biobank which is starting to provide data on half a million study participants via the new research analysis platform developed by DNAnexus and Amazon Web Services
  • Palantir’s Foundry NHS Covid-19 Data Store which sits on a Microsoft Azure platform under contract with NHS England and NHS Improvement. Within that secure cloud processing environment, Palantir manages its Foundry platform, which data and code do not leave[8]

The SAIL Databank and OpenSAFELY are outlined in more detail in the original publication.


Limitations


Despite being a clear step forwards for research, TREs are not a perfect solution for all uses in all scenarios.

Variation: TREs have largely been developed as independent efforts across countries, sectors and disciplines, and therefore there are vastly different TREs across the UK and internationally. Some of these have been developed in-house and others by TRE providers. There is therefore variation in the set-up of TREs and their remits, and they thus cannot all support all types of research.

Use cases: Crucially, there are limitations to what can be done with a TRE, by design. For example, research where re-identification of patients is required for clinical purposes is not currently possible with most TREs. Genomics England, by contrast, has established a route by which a researcher who detects a diagnosis can flag this and trigger a process to alert relevant clinicians. In general, TREs are currently designed for research purposes as opposed to clinical purposes or operational use.

Interoperability: Furthermore, researchers can experience difficulties in working with data across different TREs, as the data is effectively and intentionally siloed away from data in other TREs. The fragmentation of TREs means they are often not operating to common data standards. TREs exist in separate ‘kingdoms’ and there can be insufficient information available about what exists in the TRE, the limitations of the data and who else is using it. Connecting data between TREs is of interest to many.

Tooling and user support: Moreover, tooling in a TRE is limited to what is available from the provider. It therefore does not support every researcher's needs in every case, including the use of advanced analytical tools such as Python, R and Stata. In addition, TREs require staffing to support ongoing activities such as access to information and checking of outputs. There have been reports of researchers being unable to obtain the required support or facing long delays in gaining access, signalling insufficient support staffing.

Automation: There are clear opportunities for TREs to become more sophisticated as technology evolves. For example, the outputs from TREs are strictly controlled, and output checking is currently a manual process. Automating this process could make it more time-efficient and less error-prone.
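One simple form that an automated output check could take is small-number suppression, where cells in an aggregated table that fall below a minimum count are withheld and flagged for a human reviewer. The sketch below is an illustration only. The threshold of 10 and the function names are assumptions rather than a rule used by any particular TRE, and real output checking also has to consider risks this does not cover, such as recovering a suppressed cell by subtracting from a total.

```python
def suppress_small_counts(table, threshold=10):
    """Return a copy of `table` with counts below `threshold` replaced by None.

    `table` maps a category label to a count. The default threshold is an
    assumed value chosen for illustration.
    """
    return {label: (count if count >= threshold else None)
            for label, count in table.items()}


def cells_needing_review(table, threshold=10):
    """Sorted labels of the cells that fell below the threshold."""
    return sorted(label for label, count in table.items() if count < threshold)


# Example with made-up counts for an aggregated summary table.
summary = {"group_a": 120, "group_b": 4, "group_c": 10}

print(suppress_small_counts(summary))
print(cells_needing_review(summary))
```

A check like this would not replace manual review, but it could let reviewers concentrate on the cells that are flagged.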

Scale and timeliness: As no de-identified patient-level data can be imported or extracted, advanced applications of technology such as machine learning, which need large-scale data, are close to impossible. Data may also come from static snapshots rather than being available in real time. Design and use of TREs must accommodate innovation from communities outside of NHS providers, and the data requirements for advanced applications.



The future of TREs in the NHS


TREs are a crucial element in creating integrated data across multiple settings for the 65 million people across the UK. They can permit an increase in research on both retrospective and prospective data and thereby support aspirations of UK life sciences. The aim of TREs is to make data easier to access safely, to accelerate research. There are some key steps to achieve this aim:

  • Establish more working examples of TREs so that people can understand what they enable
  • Engage the public about what TREs are and how their data is being used
  • Open TREs and collaborate with a range of researchers, not just those in academia
  • Support development of TREs with smart access processes and the necessary tooling
  • Investigate enablers of linkage across TREs, standards and accreditation of TREs
  • Demand transparency about what TREs contain and their capabilities
  • Engage more across the NHS, with industry and private sector partners

TREs are a key part of the draft NHS Data Strategy. The strategy confirms TREs will be increasingly looked to for solutions, and commits to continue to examine how they operate and are governed.[11]

NHS Digital has a TRE service in place for England, providing approved researchers with access to essential linked, de-identified health data to answer Covid-19 related research questions.[12] The NHS Digital TRE abides by the Five Safes framework, and approved researchers gain access to the secure Data Access Environment to conduct their analysis.

The first client was the British Heart Foundation Data Science Centre, in partnership with Health Data Research UK (HDR UK), for which the NHS Digital TRE service created an environment capable of answering complex research questions. HDR UK does not run TREs, but focuses on leveraging the capabilities of existing TREs across the UK, including those of NHS Digital, Public Health Scotland/EPCC, SAIL Databank, HSC NI Public Health Agency and the Office for National Statistics.[13] More recently, researchers from DATA-CAN, the UK’s health data research hub for cancer, have been able to access de-identified cancer data via the NHS Digital TRE, enabling Covid-19 research on cancer referrals, diagnoses and treatment.[14]

As widely reported in the news, NHS Digital has been planning the daily collection of GP data to support health and care planning and research, known as General Practice Data for Planning and Research (GPDPR).[15] The backlash against the proposal and its timeline resulted in a delay to its implementation. Implementation of GPDPR is set to begin only once specific criteria have been fulfilled. One of these criteria is that a TRE must be developed and implemented to hold the uploaded data before any data is collected.[16] Another criterion is that patients must have been made more aware of the scheme through a campaign of engagement and communication.

TREs are a mechanism for providing researchers with safe access to data. Their strengths and limitations must be communicated clearly to the public, patients and clinicians, so that an informed public debate can lead to acceptable data use. GPDPR provides an opportunity both to strengthen the NHS Digital TRE and to inform and engage the public, patients and clinicians (particularly GPs) on data use.



Conclusions


TREs offer an exciting opportunity to increase the safe use of data, including health data, and also to increase public trust in data use. 

TREs are not the perfect solution to all data-based use cases and it is important to recognise that currently TREs are of use in research, but generally not in addressing clinical or operational needs. 

TREs are however a key part of the future of data use across the UK. This includes the delivery of the NHS Data Strategy. 

NHS Digital already has a TRE service in place for the population of England, which has answered key Covid-19 related questions on cardiovascular diseases and is being expanded to cover cancer data.


The technology supporting TREs is evolving rapidly, and new solutions will continue to emerge as needs arise.

Making data accessible in a safe manner, and communicating this, is a critical area to advance. 

Standards in data and in access are amongst the most important areas for development. 

There is a danger that enthusiasm for protecting and restricting data will impede the flow of data and limit research. This cannot be allowed to happen.

In addition, capability needs to exist to support and deliver the opportunities of TREs. 

We must plan carefully for how this can be supported, including by being relatively flexible about using the capability that exists across universities, the NHS and the private sector.


References


See the original publication

Originally published at https://www.carnallfarrar.com on January 1, 2022.
