How is data kept safe?

It is essential that patient data is kept safe and secure, to protect your confidential information.

There are many ways that your privacy is protected. Read on to learn more, or click here for a short visual explainer.

A common framework for assessing and explaining how data is being kept safe is the ‘Five Safes’ Framework.

This was initially developed by the Office for National Statistics and other data providers, and has increasingly been adopted by other organisations as a framework for developing safe data access systems. The five elements are Safe People, Safe Projects, Safe Data, Safe Settings and Safe Outputs, each of which is explained below.

As well as this framework, there are also many underlying laws, regulations and policies.

Safe People

To keep data safe, the only people who will be able to access data are people with a legitimate reason to do so. Whether they work for the health service, research or academia or another organisation that might access data, they must receive appropriate training and demonstrate that they have the technical skills needed before they are approved to access data. This could include completing data security awareness training or becoming an accredited researcher.

There should also always be an audit trail that records every time that personally identifiable data is viewed or used. For instance, in hospitals, there is an audit trail that records any time someone accesses your patient records. Keeping such a record is sometimes difficult when data is shared offline, e.g. in the post, or when data is sent to another organisation on a spreadsheet.
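As a rough illustration of what an audit trail records, here is a minimal sketch in Python. It is not how any NHS system is actually built: the class names, field names and example user IDs are all made up for illustration. The idea it shows is simply that every view or change adds a new entry saying who accessed which record, when and why, and that existing entries are never edited.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AccessEvent:
    """One audit entry: who touched which record, how, why and when."""
    user_id: str
    record_id: str
    action: str
    reason: str
    timestamp: datetime


class AuditTrail:
    """Append-only log (hypothetical): entries can be added and read, not changed."""

    def __init__(self):
        self._events = []

    def record(self, user_id, record_id, action, reason):
        # Each access creates a new, immutable entry with a UTC timestamp.
        event = AccessEvent(user_id, record_id, action, reason,
                            datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def events_for(self, record_id):
        # Everything that has happened to one patient's record, in order.
        return [e for e in self._events if e.record_id == record_id]


# Made-up example accesses.
trail = AuditTrail()
trail.record("nurse-017", "patient-001", "view", "ward round")
trail.record("clerk-204", "patient-002", "update", "change of address")
trail.record("dr-093", "patient-001", "view", "outpatient clinic")

viewers = [e.user_id for e in trail.events_for("patient-001")]
print(viewers)
```

A log like this is straightforward when access happens inside one system. It is much harder to maintain once a copy of the data has left that system, which is why offline sharing is a weak point.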

Safe Projects

Getting approval to use patient data

Any request to use patient data should be assessed by an independent review committee, who check that the reason for using the data is appropriate. These committees usually have patients or members of the public on them, too.

Organisations that look after patient data will have a clear review process to ensure data is only used appropriately. There are three things that will be checked:

WHY is the data needed
WHO is accessing the data
HOW will the data be protected

What other checks are there?

Research applications will usually be reviewed by an expert research ethics committee to ensure that, in addition to reviewing the request for data, the project itself meets established standards for ethics in research. There are extra controls on access to personally identifiable information where it is not possible to ask for consent.

In England and Wales, requests for the use of data for research purposes are reviewed by the Confidentiality Advisory Group (CAG), which also provides advice to the Secretary of State for non-research uses. NHS England is developing a new data advisory group to oversee data access arrangements and has an interim group in place.

In Scotland, data access requests are overseen by the Public Benefit and Privacy Panel for Health and Social Care.

In Northern Ireland, the Privacy Advisory Committee can advise on uses of patient data but has no statutory powers.

Strict legal contracts

A legal contract should be signed before data can be transferred or accessed. This sets out strict rules about what an organisation can do with the data, and has clear restrictions on what is not allowed.

Data protection impact assessments (DPIAs) are checks that are done to identify the risks that come from sharing or processing personal data, and to minimise these risks as early as possible. A DPIA will usually be done before a data sharing or processing agreement (or contract) is put in place, and will inform what that agreement requires.

What does a data sharing contract include?

what data will be provided, and how
the purpose for which the data can be used
when and how data must be destroyed after use
the data security requirements that must be followed
what an organisation must not do with the data:
- data cannot be used in any way to re-identify an individual
- data cannot be linked with any other data, unless explicitly approved in the application
- data cannot be passed to any third parties, unless explicitly approved in the application
the organisation can be audited to check data is being used appropriately.

Safe Data

The best way to protect someone’s information is to remove details that identify a person and take further steps to ‘anonymise’ it. Anyone wanting to use patient data will only be given the minimum amount necessary to answer a question, though this can sometimes be difficult if the question is very broad or it isn’t known which data specifically will answer the question.

There are different levels of anonymisation, and it is often not possible to fully anonymise health data. Often, a process called ‘pseudonymisation’ is used, where a unique marker (often a random-looking string of letters and numbers) is used in place of identifying information such as name, address, and NHS number. This distinguishes different people’s data in the dataset but ensures their identities are not revealed. The identifying information is kept separate, and individuals can only be re-identified if there are legitimate, legal reasons to do so. This is different from ‘anonymisation’ where all identifying information has been removed and no identifier exists to potentially re-identify the data.
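To make the idea of pseudonymisation more concrete, here is a minimal sketch in Python. It uses a keyed hash (HMAC) to turn an NHS number into a random-looking marker. This is one common technique, chosen for illustration, and is not necessarily the method any particular organisation uses. The function names, field names and NHS numbers are all made up.

```python
import hashlib
import hmac
import secrets

# Assumption: a secret key held only by the data controller. Without it,
# the markers cannot be recomputed from NHS numbers.
SECRET_KEY = secrets.token_bytes(32)


def pseudonym(nhs_number, key=SECRET_KEY):
    """Return a random-looking marker that is stable for the same person."""
    digest = hmac.new(key, nhs_number.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def pseudonymise(records, key=SECRET_KEY):
    """Split records into research rows and a separately held lookup table."""
    research_rows = []
    lookup = {}
    for rec in records:
        pid = pseudonym(rec["nhs_number"], key)
        # Identifying details go into the lookup, which is stored separately.
        lookup[pid] = {"name": rec["name"], "nhs_number": rec["nhs_number"]}
        # The research copy keeps only the marker and the clinical detail.
        research_rows.append({"pid": pid, "diagnosis": rec["diagnosis"]})
    return research_rows, lookup


# Made-up records; the first and third belong to the same person.
records = [
    {"name": "Example Person A", "nhs_number": "0000000001", "diagnosis": "asthma"},
    {"name": "Example Person B", "nhs_number": "0000000002", "diagnosis": "diabetes"},
    {"name": "Example Person A", "nhs_number": "0000000001", "diagnosis": "eczema"},
]
research_rows, lookup = pseudonymise(records)
for row in research_rows:
    print(row)
```

The same person always gets the same marker, so their records can still be linked together for analysis, but the research copy contains no name or NHS number. Re-identification is only possible for someone who holds the separate lookup table or the key. In an anonymised dataset, by contrast, no such marker or lookup would exist at all.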

Take a look at the section on 'How my privacy is protected' in our guide to health datasets, to learn more about anonymisation.

What if it isn’t possible to anonymise the data?

If it is not possible to anonymise the data, there are strict controls on how personally identifiable data can be used and stored. It can only be used if you give your permission or where required by law, and then only with robust safeguards.

Safe Settings

All health data must be stored securely, with controlled access and robust IT systems to keep data safe. There are numerous safeguards in place, including the use of technology to protect data, for example by restricting access (using passwords or swipe cards), or by using encryption so the data can only be read with a code.

IT systems must be kept up to date to protect against viruses and hacking. The NHS (and the Health and Social Care service in Northern Ireland) monitors threats and security incidents, and provides support to local organisations and providers to help keep computer systems safe.

In order to improve the safety of health data, health services, researchers, government organisations and academics have increasingly moved toward the use of Trusted Research Environments (TREs), sometimes also known as data Safe Havens or Secure Data Environments (SDEs). These systems are seen as a more secure way of accessing data, particularly for research. Data held in a centrally controlled TRE or SDE can only be accessed by authorised users, for approved projects. Users can only see the data they need for their work. The data owner retains control over the data: they can decide what data is accessed, see what analysis is being undertaken, and prevent the data from being shared with anyone outside the approved users or reused for other projects or use cases.

In England, the Government committed in the Data Saves Lives Strategy to develop SDEs and to implement a phased move toward all NHS data being accessed through SDEs.

In Scotland, these environments are called Safe Havens. They exist as a regional network alongside a National Safe Haven, and include NHS Scotland data and non-health population-level data.

In Northern Ireland, Health and Social Care hosts the Honest Broker Service (HBS) TRE which includes hospital and family practitioner data.

In Wales, the Secure Anonymised Information Linkage (SAIL) Databank is the national TRE and includes NHS data as well as non-health data such as education and social care data.

There are also new techniques being used to improve privacy while still supporting research, for example Privacy Enhancing Technologies (PETs). PETs are technical solutions that protect people’s privacy, and using them can help demonstrate a ‘data protection by design’ approach.

Safe Outputs

‘Safe outputs’ is typically relevant when accessing data in TREs. In a TRE, data can only be accessed directly in the system and the data itself never leaves the TRE. Only the outputs of the analysis are exported. This is different from the previous way of doing things, where datasets were issued to approved users, which meant data owners (such as the NHS or HSC) had less control once it had left their hands.

Before the outputs can be exported, all analysis and outputs are checked by the data controller to ensure that the agreement has been followed and that the outputs cannot be used to identify anyone. Once reviewers are satisfied that the data user has followed the right protocols, the outputs will be released.

Safe outputs can also apply to checks that are performed on data that is shared or published, for example ‘statistical disclosure control’. When statistics are released at a detailed level (e.g. for small numbers of people or small geographical areas), the risk of being able to identify individuals is likely to be increased. Statistical disclosure control is a process to check that this risk remains extremely low, or to put in place further measures to reduce it, such as using asterisks for numbers that are too low, putting the data into different groupings, or rounding numbers up or down. More information can be found here.
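Two of the measures mentioned above, replacing low numbers with asterisks and rounding, can be sketched in a few lines of Python. This is an illustration only: the threshold of 5 and rounding to the nearest 5 are assumptions chosen for the example, not an official rule, and the function name and area counts are made up.

```python
def apply_disclosure_control(counts, threshold=5, base=5):
    """Suppress small counts and round the rest before publication.

    threshold and base are illustrative assumptions; real values are set
    by the publishing organisation's own disclosure control policy.
    """
    safe = {}
    for group, n in counts.items():
        if n < threshold:
            # Too few people: show an asterisk instead of the number.
            safe[group] = "*"
        else:
            # Round to the nearest multiple of `base` using whole numbers.
            safe[group] = base * ((n + base // 2) // base)
    return safe


# Made-up counts of patients per area.
raw_counts = {"Area A": 128, "Area B": 47, "Area C": 3, "Area D": 12}
published = apply_disclosure_control(raw_counts)
print(published)
# {'Area A': 130, 'Area B': 45, 'Area C': '*', 'Area D': 10}
```

Hiding a single small number is not always enough on its own. If an exact total is published alongside it, the hidden value can sometimes be worked out by subtraction, which is one reason rounding or regrouping is often applied as well.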

Examples and more information

The Office for National Statistics produced this blog on how it adopts the Five Safes framework in its work across all public sector data.

OpenSAFELY is a health data analytics platform that was created to deliver urgent analysis during the COVID-19 pandemic. This page explains how they applied the Five Safes framework to keep data safe.

This video explains Trusted Research Environments and the Five Safes.

Data ethics, governance and regulation

The ethical and social implications of using patient data need to be carefully thought through. To ensure data is used in socially just and equitable ways, the right kinds of regulations and good governance need to be in place.

Where advances in data science have the power to create valuable insights into health, illness and treatments, it may be ethically the right thing to use patient data. But there are also some significant ethical and social questions about how this data should be used, managed and protected.

It is particularly important to keep ethics and governance in mind as new data technologies emerge to ensure that regulation and oversight keeps pace with rapid innovation.

In recognition of the growing importance of data ethics, the Government has published a Data Ethics Framework (2020) which sets out key principles for the use of data in the public sector. It is aimed at those working in the public sector, who must work through the framework when starting any project. This is not specific to the health sector or health data. It has also published a Data Sharing Governance Framework that sets out principles to reduce or remove non-technical barriers to data sharing in the public sector.

In 2018, a government body – the Centre for Data Ethics and Innovation (CDEI) – was established and tasked with enabling the trustworthy use of data and AI. It sits within the Department for Science, Innovation and Technology and works with experts in data, public engagement, engineering and computer science to deliver approaches to data and AI governance that are informed by public attitudes.

This report from the Royal Society (2020) provides an overview of the current state of data governance in the UK.

Data ethics and governance in healthcare

Data ethics and governance regarding patient data is often considered especially important due to the sensitive nature of people’s health data. The ‘Data Saves Lives’ strategy (2022) in England included commitments to improve and simplify data and information governance, develop an information governance portal to provide a one-stop shop for guidance, and develop a national information governance transformation plan.

The Health Research Authority (England) and the Devolved Administrations also provide a Research Ethics Service so that research proposals relating to their area of responsibility can be reviewed by a Research Ethics Committee (REC). These protect the rights, safety, dignity and wellbeing of research participants. This applies to research projects using patient data. There are more than 80 NHS Research Ethics Committees across the UK.

Many requests to access patient data have to go through an application process, which includes governance steps. There are often panels that review these requests and make decisions. The panels usually contain ‘lay’ members to improve the governance and transparency. See our section on “Who Decides Access” for more information.

NHS England has set up an NHS AI Lab Ethics Initiative to embed ethical approaches to AI in health and care. It is also assessing the feasibility and merit of data stewardship models that could increase visibility of health data, transparency over its use, and empowerment of patients and the public in decisions about granting access to it for AI purposes.

The Ada Lovelace Institute also undertakes research and influences policy to ensure that data and AI work for people and society, including but not limited to in healthcare.

Nesta has an AI Governance Database which is an information resource about global governance activities relating to artificial intelligence.

The UK is hosting the first global summit on Artificial Intelligence safety in November 2023.

Links and further reading

Discussions of data ethics have important implications for how data is used in healthcare and biomedical research, and how this is regulated.

Here are some reports considering these issues:

Ethical, social, and political challenges of artificial intelligence in health (2018)

A Wellcome Trust and Future Advocacy report exploring the ethical, social and political challenges resulting from current and prospective uses of AI in healthcare and biomedical research.

Confronting Dr Robot: Creating a people-powered future for AI in health (2018)

A Nesta report that develops principles for how AI should be applied to healthcare, and policy recommendations to implement these principles.

The collection, linking and use of data in biomedical research and healthcare: ethical issues (2015)

Nuffield Council on Bioethics report on the changes predicted to take place in the data landscape and the governance and ethical implications of uses of data.
