Professor Cathie Sudlow is Director of the British Heart Foundation Data Science Centre – a partnership between Health Data Research UK and the British Heart Foundation – which enables research using health data into diseases of the heart and circulation. In this article, Cathie discusses trusted research environments - the benefits they offer and the challenges they can present.
Trusted research environments (TREs) are increasingly being used for health data research. They can play an important role in maintaining the security and privacy of health data, and so can help with building public trust. They have several major benefits but also some limitations, as explained below.
What is a trusted research environment?
A TRE is a secure computing environment that holds data and enables access to it for analysis. It’s a relatively new way of managing data and there is not yet a single agreed definition of exactly what a TRE is, which is one of the reasons I’m writing this post.
Traditionally, researchers have needed to download a dataset onto their own computer to be able to use it for their analysis. But in a TRE the data remains in a secure location, and approved researchers access it remotely. Researchers cannot take individual-level data out of the TRE; they can only export analysis results (such as tables and figures), and only after careful checks have been made. It’s a bit like a ‘reference library’ – approved researchers use the ‘reference library’ to access the data they need for a specific research study, but the data itself doesn’t actually leave the security of the ‘library’.
Holding healthcare data in a TRE helps keep it secure, whilst making it accessible for researchers to conduct research for public benefit in a safe way.
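To give a flavour of what the checks on exported results can involve, here is a minimal sketch in Python of one common kind of check: hiding small counts in a summary table before it leaves the secure environment. This is an illustration only, not the procedure used by any particular TRE. The function name, the threshold of 5 and the example numbers are all assumptions made up for this sketch; each data custodian sets its own rules.

```python
from typing import Optional

# Hypothetical threshold: each TRE defines its own disclosure rules.
MIN_CELL_COUNT = 5


def suppress_small_counts(
    counts: dict[str, int], threshold: int = MIN_CELL_COUNT
) -> dict[str, Optional[int]]:
    """Return a copy of `counts` with every count below `threshold` replaced by None.

    Small counts are hidden because a table cell describing only a handful
    of people could help someone work out who those people are.
    """
    return {
        group: (n if n >= threshold else None)
        for group, n in counts.items()
    }


# Invented example: number of patients per age group in a results table.
results = {"40-59": 128, "60-79": 342, "80+": 3}
safe_to_export = suppress_small_counts(results)
print(safe_to_export)  # {'40-59': 128, '60-79': 342, '80+': None}
```

In practice, output checking usually involves review by trained staff as well as automated rules, and it covers more than simple counts.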
How we use trusted research environments at the BHF Data Science Centre
Our research uses routinely collected NHS healthcare data, including data from GPs, hospitals and death registries. Information that could directly identify any individual (such as name, date of birth, and NHS number) is removed from the data before it is made accessible. We use this information for research to improve our knowledge of heart and circulatory conditions, and ultimately to benefit patients and save lives.
We have worked with data custodians that run TREs across the UK to enable a collaborative group of researchers in the CVD-COVID-UK consortium to access data for important COVID-19 research. The TREs are run by Public Health Scotland, Swansea University (on behalf of NHS Wales), and NHS Digital in England. The consortium includes over 200 members from over 40 institutions across the UK. These researchers have been working together to understand the relationship between COVID-19 and cardiovascular diseases such as heart attack and stroke. The consortium includes public contributors, who review proposals requesting to access data to ensure they are relevant, appropriate and will be of public benefit.
What are the benefits of TREs?
Health data is both sensitive and personal – and members of the public have understandable concerns about how it might be used. People want to know that health data is kept safe and only used in ways that benefit the public and the NHS. People are generally supportive of health data being used for research for public benefit. But they want to be able to find out who is using the data and what research they are doing, and to know that data is used in a secure way that preserves privacy.
Transparency, trust and trustworthiness have to be at the heart of using healthcare data for research. Making data available through a TRE can help, by using technical measures to address many of the concerns people have. The data can only be used for specific purposes, all instances of access to the TRE can be logged and audited, and researchers’ analyses are checked to make sure that no information that might identify a person leaves the secure environment. There are of course some nuances to this. For example, while it’s possible to monitor who is accessing the TRE, other specialised tools are needed to track exactly how they are using the data. As TREs become more commonly used, this is an area which needs additional investment.
A TRE can also help encourage collaboration by allowing researchers from all over the UK to work together on projects, remotely accessing and analysing the same data. One research project we are doing right now is analysing health data within the NHS Digital TRE to assess the safety of COVID-19 vaccines. We’re looking at whether there is an association between different vaccines and rare blood clots, to shed light on how big any risk is and what that might mean for vaccination policy. We’ll also use data in TREs to better understand the longer-term impact of COVID-19 infection, for example on fatigue, breathlessness and immune function.
It’s only by being able to study data in a TRE for the entire population that we can answer important research questions like this. It’s a particularly powerful way of supporting very large-scale and broad research studies that require access to large amounts of linked healthcare data. Such studies wouldn’t really be possible with any other type of research set-up.
What are the limitations of TREs?
One challenge for us in conducting research into large numbers of people across the whole of the UK - England, Scotland, Wales and Northern Ireland - is that each country’s national data custodian runs its own system and its own TRE. The healthcare data for people in Scotland is managed by Public Health Scotland, the data for England is managed by NHS Digital, and so on. There isn’t one organisation or TRE for all of the UK, so one of our next steps is to work out how to link across TREs that cover different geographies to create a UK-wide picture.
For some types of research, data has to be made available outside of the confines of the data custodian’s TRE. One recent example is the RECOVERY trial, which has been testing different potential treatments for COVID-19. This important trial collects data from its participants in a large database that’s held very securely at the University of Oxford. To follow the health of the trial participants, the trial obtains data on them from NHS Digital and other data custodians. That data has to move to the University of Oxford’s secure setting to be linked into the trial database. Analysis of the combined data allows the results of the trial to be produced so that the research team can find out which treatments are effective and which are not. It’s important to recognise that the health data custodians’ TREs will not support every type of research.
On a practical note, there may also be limitations on the capacity for data custodians to keep up with the demand from the research community to access their TREs. In addition, not all TREs have all of the tools and analytic software that different researchers might want to use, as there are so many available. We work closely with the data custodians to try to ensure that TREs are continuously improved and provide the tools that researchers need to conduct their research studies.
The UK has some of the richest health data in the world. By making this data available to researchers, we can improve our understanding of diseases such as heart and circulatory conditions, and seek ways to prevent, treat and cure them. Ultimately, research using health data saves and improves lives. TREs are a vital tool in enabling this to happen. Ensuring that they can operate effectively will help us maximise the possibilities of using health data for patient benefit, in a transparent and secure way.
If you want to find out more about the work Health Data Research UK is doing on TREs, take a look at this page on their website.