UK TRE Glossary

Term Tags Definition
AAI

Security Management

An abbreviation of "authentication and authorisation infrastructure", AAI refers to the technical mechanisms used to verify and manage users' access to computer systems.

See also: Access Control; Authentication; Authorisation.

Access Control

Security Management

The technical mechanism for controlling a known (authenticated) user’s access to a system and its underlying assets such as data. Access control is also referred to as authorisation (and shorthanded as “AuthZ” to distinguish it from authentication), as it determines what the user is authorised to do.

See also: AAI; Authentication; Authorisation.

Actor

Architecture

A person, organization, or system that has one or more roles and that initiates or interacts with activities.

Example: The SATRE architecture needs actors such as data analysts and internal auditors.

See also: Role.

Administrative Data

Data in general

See: Administrative data 🔗.

Algorithm

Computing

A sequence of computational steps for processing data to achieve a particular outcome. Algorithms can range from the simple (add up a set of numbers) to the complex (use complicated mathematics to search for patterns in image data). Algorithms are usually described generally, as mathematics or in words, in contrast to computer programs which are written in specific computer languages.
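
To illustrate the distinction, the "add up a set of numbers" algorithm mentioned above can be written as a short computer program. This is only a sketch in Python; the function name `add_up` is a hypothetical choice, and the same algorithm could be written in any other language.

```python
def add_up(numbers):
    """Add up a set of numbers, one step at a time."""
    total = 0
    for n in numbers:  # visit each number once
        total += n     # keep a running total
    return total


print(add_up([3, 5, 9]))  # prints 17
```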

Analysis

Computing

Also Data Analysis. Techniques that produce knowledge from organised information. Processes of inspecting, cleaning, transforming, and modelling data with the goal of highlighting useful information, suggesting conclusions and supporting decision making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains.

See also: Data Analysis 🔗.

Anonymisation

Identifiability

The process of making personally identifiable data anonymous so that individuals can no longer be identified. In contrast to Pseudonymisation, true anonymisation cannot be reversed.

See also: Pseudonymisation.
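
The contrast can be sketched in code. The example below is illustrative only and rests on several assumptions: the key, the record fields and the function names are all hypothetical, a keyed hash is just one possible way to create pseudonyms, and dropping one identifier and banding one field is far from sufficient to anonymise real data.

```python
import hashlib
import hmac

# Hypothetical secret; whoever holds it can regenerate the same pseudonyms
# and so re-link records to known identifiers.
SECRET_KEY = b"example-key-held-by-the-data-custodian"


def pseudonymise(identifier: str) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256), shortened for display."""
    digest = hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def anonymise(record: dict) -> dict:
    """Drop the identifier entirely and coarsen age into a ten-year band."""
    decade = (record["age"] // 10) * 10
    return {"age_band": f"{decade}-{decade + 9}", "diagnosis": record["diagnosis"]}


record = {"patient_id": "patient-001", "age": 47, "diagnosis": "asthma"}
pseudonymised = {**record, "patient_id": pseudonymise(record["patient_id"])}
anonymised = anonymise(record)

print(pseudonymised)
print(anonymised)
```

In this sketch the pseudonymised record still carries a stable stand-in for the identifier, whereas the anonymised record carries none.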

Application Deployment

Computing

The process of installing, configuring, and making software applications available for use within a given environment (eg, a Trusted Research Environment (TRE)).

Application Component

Architecture

An encapsulation of application functionality which is modular and replaceable.

Example: To perform work within a TRE a data analyst might need access to a Desktop or command line interface application component.

See also: Desktop; Command Line Interface (CLI).

Application Programming Interface (API)

Computing

A type of software interface that provides a way for two or more computer programs to communicate with each other. In contrast to a user interface, which connects a computer to a person, an application programming interface connects computers or pieces of software to each other.

Application Stack

Computing

A number of applications, tools and other software that work in concert to form a complete software solution.

Architecture

Architecture

An architecture defines the structures and behaviours of an organisation including people, processes, data and technology. This helps build a blueprint for how organisations and people work with technology to deliver TREs.

Architectural Principle

Architecture

Fundamental guidelines that inform the design, decision making and implementation of a TRE. These principles provide a framework to ensure that the design of the underlying components of a TRE is aligned with consistent goals, values and best practices.

See also: Trusted Research Environment (TRE).

Artificial Intelligence (AI)

Computing

A branch of computer science that aims to create technology and systems that perform tasks and make decisions in ways that resemble human intelligence. AI systems can be built in various ways, with the most common current method being Machine Learning.

Examples: A chess-playing computer program is an example of a specialised AI system (it can play chess, but nothing else). The programs inside a modern robot that can climb stairs and walk over uneven ground are an example of a more general AI system.

See also: Machine Learning (ML).

Asset Management Process

Management

A systematic approach to acquiring, operating, maintaining, and disposing of assets within an organisation, aimed at maximising their value and minimising risks.

Authentication

Security Management

The technical mechanism by which a computer user proves that they are who they say they are. Authentication is often shorthanded as “AuthN” to distinguish it from authorisation.

Example: The combination of a username and a password is a method of authentication.

See also: AAI; Access Control; Authorisation.

Authentication Application

Security Management

A software system that verifies and validates the identities of users or entities accessing a system through authentication.

See also: Authentication.

Authentication Token

Security Management

A piece of data used to authenticate the identity of a user or application to a computer system. Authentication tokens are often generated by authentication applications, and possession of a given token is evidence that the owner has successfully authenticated themselves to the system in question.

See also: Authentication; Authentication Application.
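
As a rough sketch of the idea, the Python below issues a token by signing a username with a secret that only the server knows, and later checks that a presented token carries a valid signature. The secret, the token layout and the function names are hypothetical; real systems typically add expiry times and rely on established token standards rather than a hand-rolled format like this one.

```python
import hashlib
import hmac

# Hypothetical secret known only to the authentication application.
SERVER_SECRET = b"hypothetical-secret-known-only-to-the-server"


def _sign(username: str) -> str:
    """Return an HMAC-SHA256 signature of the username as a hex string."""
    return hmac.new(SERVER_SECRET, username.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_token(username: str) -> str:
    """Issue a token after the user has authenticated (that step is not shown)."""
    return f"{username}.{_sign(username)}"


def verify_token(token: str) -> bool:
    """Check that the token's signature matches its username."""
    username, _, signature = token.rpartition(".")
    if not username:
        return False
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(signature.encode("utf-8"), _sign(username).encode("utf-8"))


token = issue_token("alice")
print(verify_token(token))           # True
print(verify_token("alice.forged"))  # False
```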

Automated Disclosure Control

Computing

Disclosure control (qv) without the intervention of a human being each time. Automated disclosure control aims to capture, in an automated software system, the rules needed to ensure that a given dataset cannot be used to identify any individual.

See also: Disclosure Control.
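
One kind of rule that such a system might encode is the suppression of small counts in output tables. The sketch below is an assumption for illustration, not a rule taken from this glossary: the threshold of 10, the table contents and the function name are all hypothetical, and a real TRE would apply its own rules.

```python
THRESHOLD = 10  # hypothetical minimum count that may be released


def suppress_small_counts(table: dict, threshold: int = THRESHOLD) -> dict:
    """Replace any count below the threshold with None so it is not released."""
    return {
        group: (count if count >= threshold else None)
        for group, count in table.items()
    }


counts = {"asthma": 124, "rare condition": 3, "diabetes": 57}
print(suppress_small_counts(counts))
# {'asthma': 124, 'rare condition': None, 'diabetes': 57}
```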

Authorisation

Computing

Authorisation is a process of verifying that a person or other agent can legitimately take some action, such as gaining access to a dataset, editing a document, entering a building or making a payment. An administrative authority must determine whether there are sufficient grounds for authorising the action. Authorisation is often shortened to "AuthZ" to distinguish it from authentication.

See also: AAI; Access Control; Authentication.

See also: Authorisation 🔗.

Best Practice

Processes

A set of guidelines that, if followed, is known to produce good outcomes. Best practice may be based on different levels of research evidence and/or collective experience.

Business Process

Architecture

A set of actions which produce a specific desired outcome.

Example: To access the TRE a data consumer needs to complete an onboarding business process.

See also: User Onboarding; Trusted Research Environment (TRE).

Big Data

Data in general

Large amounts of information that, because of its scale, may need novel or non-standard methods to process. In the original coining, "big" referred to one or more of volume (the raw size of the data), velocity (the rate at which new data were generated) or variety (the complexity or richness of the data).

Caldicott Guardian

Special aspects in the NHS Context

A senior professional in the NHS who safeguards patient confidentiality and privacy. They are responsible for protecting patient information, including how it is used in, for example, research. Named after Dame Fiona Caldicott, the first UK National Data Guardian.

Capability

Architecture

An ability that a system possesses. Capabilities are typically expressed in general and high-level terms. Achieving a capability typically requires a combination of organisation, people, processes, and technology.

See also: Capability Decomposition.

Capability Decomposition

Architecture

A set of components that realise a capability. These components will vary depending on the nature of the capability. Business-focused capabilities will be realised by business processes, roles and services. Technology-focused capabilities will be realised by applications, services and interfaces. In addition to the components realising the capability, a catalogue of standards, frameworks and controls linked to the capabilities will provide guidance on how to implement the capabilities safely.

See also: Capability; Component; Business Process; Role; Application Component.

Census

Data in general

A survey of a national population which asks questions about age, gender, background and so on. In the UK, censuses are carried out every 10 years or so. Census information helps with things like local service planning and making important decisions. Census data can be used in academic research. If so, it is anonymised before being used.

See also: Anonymisation.

Characteristic

Data in general

A piece of information about an individual, place or thing that is potentially useful in data analysis. For example, characteristics of a person might be age, gender, ethnicity, socioeconomic status and education level. If data about individuals were recorded in a table, the columns of the table might be characteristics.

See also: Socio-demographic Factors.

Chief Investigator (CI)

Running and overseeing research

See: Principal Investigator (PI).

Clinical Trial

Health Research

A research study conducted to test a new treatment, like a medicine or other therapy. When it comes to testing medicines, clinical trials are known as Clinical Trials of Investigational Medicinal Products (CTIMPs), and they have additional special rules and regulations that need to be followed. These rules ensure the safety and effectiveness of the new treatment being tested before it can be made available to the general public and the safety of the people participating in the trials.

Clinical/ Medical/ Health Data or Healthcare Data

Health Services & Health Data

A person's information about their health or day-to-day health care. This information is collected as people see healthcare professionals, or have tests and treatments as part of their care. It is stored in electronic health records (EHRs) used by the NHS.

See also: Sensitive Data.

Cloud Computing

Computing

A model of computer access or provision where users rent computer power remotely, rather than buying and installing their own hardware locally. Cloud computing may be described as "public cloud", meaning available to anyone from a wide number of cloud computing companies, or as "private cloud" or "on-premises" (or "on-prem") cloud, meaning installed and provided privately by, for example, a firm for its own uses.

Examples: Microsoft Azure, Amazon Web Services (AWS) and Google Cloud Platform (GCP) are large, public cloud providers.

Cloud Storage

Computing

Computer data storage hosted by a cloud computing firm rather than provided locally. Access to cloud storage requires an Internet connection, in contrast to local storage which is either attached to a user's computer or needs only a local network connection.

Examples: Apple's iCloud storage, Google's Drive or Microsoft OneDrive are examples of cloud storage.

Component

Architecture

The statements concerning processes, controls, practices and applications that make up a capability, together with an importance label.

See also: Capability; Capability Decomposition.

Code Control

Computing

The management and oversight of software code (programs) or source files, including versioning, change tracking, access control and collaboration.

Contrast with: Code Lists.

Code Lists

Data in general

A collection of specific, standard codes (labels) that are used in healthcare to represent different things, such as medical diagnoses, treatments, or procedures.

Contrast with: Code Control.

Command Line Interface (CLI)

Computing

A text-based interface or environment that allows users to interact with a computer or software by typing commands or instructions, in contrast to a graphical user interface.

See also: Graphical User Interface (GUI).

Common Workflow Language (CWL)

Computing

An open standard for describing how to run software tools using command line interfaces, and how to chain them together to create workflows.

See also: Command Line Interface (CLI); Workflow.

Compliance Checking

Security Management

Related to: Compliance 🔗.

Consent

Processes

Consent is defined as a freely given, specific, informed and unambiguous indication by an individual that they agree to the processing of data relating to them.

Consent within the context of data protection regulation is one of the grounds (lawful bases) for lawfully processing personal data in relation to an individual and is specific to particular activities.

Research consent is the process of documenting an individual's choice to be involved in one or more research projects. It is typically called informed consent, which conveys that there is a process to allow participants to make a meaningful choice. "Informed consent" is used to emphasise that understanding is crucial before agreeing, and typically applies when sharing personal data or participating in research studies. Research consent is commonly required for participation in clinical trials/research.

Broad consent is a mechanism of gaining the consent of an individual who donates their biosamples and health data with a view to their future use in research, and may not be specific to a particular research project at the time of collection.

Assent is the process of providing approval for data processing/involvement in research by an individual who is not legally eligible to do so (e.g. a child under the age of 16), and will be supported by an adult providing legal consent.

Withdrawal of consent is both a legal and ethical right of the individual whose data is being processed, and must be respected in reference to data protection and research compliance. It allows an individual to discontinue or rescind access to their data and prevent further processing.

See also: Unconsented Data.

Controls

Security Management

In computer security management, measures, safeguards or mechanisms implemented to manage or mitigate risks and ensure the integrity, confidentiality, availability, and reliability of systems, processes, or data.

Data

Data in general

See: Data 🔗 and Data 🔗 and Data 🔗.

Data Archiving

Data Management

The practice of securely storing and preserving data in a read-only format for long-term retention, typically for compliance, historical reference, or reproducibility.

See also: Data Management 🔗.

Data Classification

Data Management

The categorisation or labelling of data based on its sensitivity, risk, value, or other attributes, often used to determine appropriate handling, storage, and security controls.

Data Controller

UK law and rules

A data controller is a person or organisation who decides how personal data, which is information about identifiable individuals, is used or handled. Examples of data controllers include NHS organisations like Trusts and GP surgeries, or devolved government bodies. In the UK, most organisations handling personal data must register with the ICO (Information Commissioner's Office), and their details are public. Data controllers are legally responsible for how data is managed. They must prevent misuse, report breaches, and can be fined for failing to meet these duties.

See also: Data Processor; Information Commissioner's Office (ICO).

See also: Data Controller 🔗.

Data Curation

Data in general

See: Data Curation 🔗.

Data Custodian

Data Management

The person, organisation or other entity responsible for the data. They should control access to the data and protect its use and sharing (in whole or in part) to ensure that regulations appropriate to the type of data are followed. This includes ensuring no private data is disclosed when it shouldn’t be.

Data Deletion

Data Management

The process of permanently removing or erasing data from storage systems or devices to ensure that it cannot be recovered or accessed.

Data Discovery

Processes

The process of identifying and accessing relevant data sources for research or analysis.

Data Egress

Data Management

The movement or transfer of data to locations outside of a TRE, either through manual or automated process. Data moved in this way are often known as data outputs.

Data Governance

Processes

Policies, procedures, and regulations that govern the collection, storage, access, and use of data to ensure privacy, security, and ethical considerations are addressed.

Data Ingress

Data Management

The movement or transfer of data to infrastructure inside of a TRE either through manual or automated process. Data moved in this way are often known as data inputs.

Data Lifecycle Control

Data Management

The management and oversight of data throughout its lifecycle, including storage, usage, sharing, retention, and eventual disposal.

Data Object

Architecture

A store of data or information.

Example: To know what data is stored within the TRE a study database data object is needed. This contains information on the data assets within the TRE, who owns them and other compliance information.

See also: Database; Trusted Research Environment (TRE).

Data Literacy

Data in general

The ability to understand, analyse, interpret, and critically evaluate data and data related studies.

Data Minimisation

Data Management

See: Data Minimisation 🔗.

Data Mining

Data in general

See: Data Mining 🔗.

Data Pooling

Data Management

See: Data Pooling 🔗.

See also: Federated Analytics; Federated Data.

Data Processor

UK law and rules

An entity that processes personal data on behalf of a data controller, following the controller's instructions. They do not have control over how the data is used and are only allowed to perform tasks as directed by the controller. For example, a company hired to manage an email service for another organization acts as a data processor. The processor cannot use the data for any other purposes, such as marketing, without the controller's consent.

See also: Data Controller; Information Commissioner's Office (ICO).

Data Protection Act (DPA)

UK law and rules

UK law that regulates how personal data—information that can identify living individuals—is collected, used, and stored. It provides rules for organizations on data handling, ensuring privacy and security, while giving individuals rights to access, correct, and control their own data. It implemented UK-specific aspects of the GDPR and superseded previous UK legislation.

See also: UK General Data Protection Regulation (UK GDPR).

Data Protection Impact Assessment

UK law and rules

A process used to identify and minimize risks to personal data before it is collected or processed. It evaluates how data use might impact individuals' privacy and outlines steps to protect their information. A DPIA helps ensure that data handling practices are safe and secure, functioning like a risk assessment for personal data.

Data Protection Officer (DPO)

UK law and rules

A professional responsible for ensuring that organizations comply with data protection laws when handling personal data. They advise on data privacy practices, monitor compliance, and act as a point of contact for data protection authorities. Organizations processing large amounts of personal data or those in the public sector are required to appoint a DPO, and they are listed on the public register held by the Information Commissioner's Office (ICO).

Data Science

Data in general

A field of analysis focused on extracting knowledge and insights from data. It combines techniques from data management, computer science, and statistics to store, organize, and analyze data. Data science also involves applying this knowledge to specific problems, making it highly interdisciplinary, with experts from various backgrounds (such as clinicians and computer scientists) collaborating. Its goal is to uncover useful patterns and make data-driven decisions or predictions.

Data Subject

UK law and rules

See: Data Subject 🔗.

Data Transfer Agreement

UK law and rules

An agreement or contract between a data controller and another organisation (such as a data processor), governing the transfer of data.

See also: Data Controller; Data Processor.

See also: Data Transfer 🔗.

Data Transfer Service

Data Management

A service or system that facilitates the secure and efficient transfer of data between different systems, networks, or locations.

See also: Data Transfer 🔗.

Data Users

Data in general

See: Data User 🔗.

Database

Data in general

See also: Relational Database.

See: Database 🔗.

De-identification

Identifiability

See: De-identification 🔗.

Demilitarized Zone (DMZ)

Computing

A physical or logical subnetwork that separates an internal TRE network from untrusted external networks, such as the internet. The DMZ provides limited access to internal networks based on trust.

See also: Zone.

Desktop

Computing

The Graphical User Interface (GUI) and environment presented to users on their computer screens, typically including icons, menus, and windows for interacting with applications and files.

Desktop Applications

Computing

Software applications designed to be installed and run on individual computers or Desktop systems, often providing specific functionalities or tools.

Disclosure Control

Identifiability

The process of review by approved staff at a Trusted Research Environment (TRE) of any research or analysis results prior to their release from the TRE. The aim of disclosure control is to ensure there are no risks of identifying individuals in any released research results.

See also: Egress/Ingress Control.

Related to: Disclosure Control Methods 🔗.

Related to: Disclosure Check 🔗.

Egress/Ingress Control

Security Management

The implementation of measures to control and monitor the movement of data into and out of the TRE, in particular to prevent sensitive data from leaving the TRE. Often known as output/input checking, or in the case of egress, disclosure control.

See also: Disclosure Control.

Electronic Health Record (EHR)

Health Services & Health Data

A person’s health records that are held digitally on a computer (as opposed to on paper). Also known as an electronic patient record (EPR).

Ethical Approvals

Running and overseeing research

Agreement from a group of experts that a piece of research is done in a proper and respectful way. Ethical approvals ensure that participants' rights are protected and all the research is conducted responsibly.

European Union (EU) General Data Protection Regulation (GDPR)

UK law and rules

The 2016 GDPR set out the EU framework for the handling of data relating to identifiable living people. Among many other things, it sets out a variety of legal bases for using personal data, such as “the data subject has given consent”, “a task... in the public interest”, or for “scientific... research”. The UK Data Protection Act (DPA) was framed in its terms and set out UK-specific aspects. When the UK left the EU in 2020, the GDPR remained in UK law as the “frozen GDPR” or “UK GDPR”.

See also: UK General Data Protection Regulation (UK GDPR); Data Protection Act (DPA).

External Audit

Management

An independent assessment or review of the TRE organisation's controls, processes, or compliance conducted by external auditors or audit firms.

FAIR Data

Data in general

FAIR data is a set of principles ensuring data is:

Findable: Easy to locate through clear identification and metadata.

Accessible: Retrievable through standard methods, even if authentication is needed.

Interoperable: Can work across different systems and with other datasets.

Reusable: Well-documented and properly licensed so others can use it.

See also: FAIR Data 🔗.

See also: FAIR 🔗.

Federation

Processes

A grouping of organisations with their own policies and assets (e.g. datasets or computing resources) who agree to allow use of those assets by the broader group but without the assets leaving control or ownership of the organisation.

Ordinary real-world examples of this are the United States of America, Germany or Australia, where member states have individual laws and governance but also subscribe to central policies to enable and encourage working together. 

Federated Analytics

Computing

A form of analytics where data analysis happens across multiple independent organisations, with each organisation keeping complete control of their own data. Instead of combining all data in one place, the analysis program or Workflow is sent to each organisation's data. For example, multiple hospitals could participate in medical research by running the same analysis on their local patient records, then sharing only the summarized statistical results. The raw patient data would never leave each hospital, but researchers could still draw insights from the combined statistical findings across all participating hospitals.

See also: Data Pooling; Federated Data.
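
The hospital example above can be sketched as follows. This is a minimal illustration under stated assumptions: the two hospitals, their values and the function names are hypothetical, and the "analysis" is just an average age. Each site runs the same function on its own records and shares only a count and a total, never the raw values.

```python
def local_summary(ages):
    """Run at each hospital: return summary statistics only, not the raw records."""
    return {"count": len(ages), "total": sum(ages)}


def combine(summaries):
    """Run by the researcher: compute the overall mean from the shared summaries."""
    count = sum(s["count"] for s in summaries)
    total = sum(s["total"] for s in summaries)
    return total / count


# Hypothetical local data; in practice these lists never leave each hospital.
hospital_a = [34, 51, 47]
hospital_b = [62, 58]

summaries = [local_summary(hospital_a), local_summary(hospital_b)]
print(combine(summaries))  # prints 50.4
```

In a real deployment the shared summaries would usually also pass through disclosure control before release.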

Federated Data

Computing

A model of data access in which different organizations keep full control of their own data but agree on ways to safely share access to it for specific purposes. Each organization maintains its own data security and rules, but allows approved users to work with the data through agreed-upon tools and systems. For example, research institutions might share access to their datasets while keeping the data within their own secure environments, allowing collaborative research without moving sensitive data to a central location.

See also: Data Pooling; Federated Analytics.

Federated Identity Mapping

Computing

See: Federated Identity 🔗.

Federated Learning

Computing

See: Federated Learning 🔗.

Federation Operator

Management

An organization or entity responsible for managing a federated identity system or network. In such systems, multiple independent organizations (known as "federated members") collaborate to enable secure, streamlined access to resources or services without requiring users to maintain separate credentials for each participating member.

Federated Query

Computing

See: Federated Analytics.

Firewall

Security Management

A security device—either hardware, software, or a combination of both—that monitors and controls incoming and outgoing network traffic based on predetermined security rules. Its primary purpose is to establish a barrier between a trusted, secure internal network and untrusted external networks, like the internet, to protect against unauthorized access, cyberattacks, and other potential threats.

See also: Firewall 🔗.
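
As a simplified sketch of "predetermined security rules", the Python below checks traffic against an ordered rule list. The rule format, the port numbers chosen, the first-match-wins ordering and the deny-by-default fallback are all assumptions made for illustration; real firewalls match on many more properties, such as addresses and protocols.

```python
# Hypothetical rule list, checked in order.
RULES = [
    {"direction": "in", "port": 443, "action": "allow"},
    {"direction": "in", "port": 22, "action": "deny"},
    {"direction": "out", "port": 443, "action": "allow"},
]


def decide(direction, port, rules=RULES):
    """Return the action of the first matching rule, or deny if none match."""
    for rule in rules:
        if rule["direction"] == direction and rule["port"] == port:
            return rule["action"]
    return "deny"  # assumed default: block anything not explicitly allowed


print(decide("in", 443))   # allow
print(decide("in", 8080))  # deny
```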

Five Safes

Processes

The Five Safes framework is a set of principles developed to guide researchers and organizations in handling sensitive data.

See also: Five Safes 🔗.

Graphical User Interface (GUI)

Computing

A way of interacting with a computer system based on visual presentations of documents, applications and so on as windows on a screen. Users interact with a GUI using pointing devices (e.g. a mouse or a finger) rather than having to type everything.

Contrast with: Command Line Interface (CLI).

Identifiable Data

Identifiability

Data that can be used to identify, contact, or locate a specific individual, either by itself or when combined with other available information. This includes direct identifiers like full names, NHS numbers, and email addresses; indirect identifiers such as date of birth or workplace that could identify someone when combined; and context-dependent identifiers like IP addresses or device IDs. For example, while a person's age alone might not identify them, combining it with their job title and city of residence could make them identifiable – such as "a 45-year-old pediatric surgeon in Bolton, Greater Manchester" might be specific enough to identify a particular individual, even without naming them directly. This type of data requires special handling under various privacy regulations like GDPR to protect individuals' privacy and prevent unauthorized access or misuse.

See also: Direct Identifier 🔗.

See also: Indirect Identifier 🔗.

See also: Identifiable Data 🔗.

Identity and Access Management Services

Security Management

See: Identity Management 🔗.

Identity Verification

Security Management

The process of confirming or authenticating the identity of individuals or entities, often through the verification of personal information, credentials, or biometric data.

Information Asset Owner

Data Management

An individual or role accountable for managing and overseeing an information asset, including its acquisition, use, maintenance, and protection.

Information Commissioner's Office (ICO)

UK law and rules

The UK’s independent authority for upholding information rights in the interest of the public.

The ICO oversees the application of the Data Protection Act and the UK GDPR, and has the power to issue monetary penalties for infringement of data protection legislation.

Information Governance (IG)

UK law and rules

How an organisation takes care of its information or data. It involves strategies and processes for defining, collecting, storing, securing, using, protecting and disposing of data safely, while also respecting privacy. IG ensures that data is managed well throughout its life cycle, following guidelines and laws. It helps organisations handle data responsibly, protect it from risks, and use it in a way that follows rules and keeps people's information safe.

IG also identifies the processes to be followed in the event of a failure to protect personal data, and any reporting, or escalation to regulatory bodies that might be required.

Internal Audit

Management

An independent evaluation process performed within the TRE organisation that assesses and improves its internal controls, risk management, and governance.

Interoperability

Computing

The ability of two or more systems, devices, or applications to exchange and use information seamlessly. Interoperability enables these systems to work together, often through the adoption of open standards that facilitate consistent communication and data sharing without requiring custom integration. Good interoperability promotes collaboration, scalability, and the extension of services by allowing different systems to work together in a standardised, vendor-neutral way, thereby reducing technical and operational barriers.

See also: Interoperability 🔗.

Issue Management Process

Computing

A systematic approach to identifying, tracking, resolving, and managing issues or problems that arise within a TRE organisation, aiming to minimise their impact and ensure timely resolution.

Common mechanisms to manage the effective resolution of such issues can include Corrective and Preventive Actions (CAPAs), which enable such instances to be documented and provide an audit trail of activities undertaken to prevent recurrence.

IT Service Provider

Computing

A company, department, or entity that delivers information technology services or support to internal or external clients, such as network management, software development, or helpdesk support.

Lawful Basis

UK law and rules

Under UK data protection law and the UK GDPR, organisations must have a defined lawful basis to hold and use "personal data". The Health Research Authority (HRA) and Information Commissioner’s Office (ICO) advise that for almost all research conducted in the UK organisations should rely on either: (1) ‘Task in public interest’ – for all public bodies (NHS / HSC, Universities, UKRI etc), or (2) ‘Legitimate interest’ – for non-public bodies (charities etc.)

See also: Lawful Basis 🔗.

Linkage of data (data linkage)

Processes

Joining two or more sets of data together using one (or more) pieces of information common to all (often called "common keys"). Linkage may be based on straightforward rules (“two records with the same NHS number are from the same person”) or based on probability (“if two records share the same forename, surname, and date of birth, they are more likely to be from the same person”). Links may be made using identifiable data (e.g. NHS number) or de-identified data (e.g. a research pseudonym).

For example: joining a health dataset with an employment dataset using a common key based on individual names and addresses.
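The two styles of linkage described above can be sketched in a few lines of Python. This is a minimal illustration only: the records, field names, and the 0.6 threshold are hypothetical, and the "probabilistic" part is a simple agreement score rather than a full statistical linkage model.

```python
from collections import defaultdict

def link_deterministic(left, right, key):
    """Rule-based linkage: pair records that share exactly the same key value."""
    index = defaultdict(list)
    for rec in right:
        index[rec[key]].append(rec)
    return [(l, r) for l in left for r in index.get(l[key], [])]

def match_score(a, b, fields):
    """Fraction of the given fields on which two records agree (0.0 to 1.0)."""
    agree = sum(1 for f in fields if a[f] == b[f])
    return agree / len(fields)

def link_probabilistic(left, right, fields, threshold):
    """Score-based linkage: pair records whose agreement score meets the threshold."""
    return [(l, r) for l in left for r in right
            if match_score(l, r, fields) >= threshold]

# Hypothetical, made-up records for illustration only.
health = [
    {"pseudonym": "P001", "forename": "Ann", "surname": "Lee", "dob": "1980-01-02", "diagnosis": "asthma"},
    {"pseudonym": "P002", "forename": "Bob", "surname": "Ray", "dob": "1975-05-06", "diagnosis": "diabetes"},
]
employment = [
    {"pseudonym": "P001", "forename": "Ann", "surname": "Lee", "dob": "1980-01-02", "status": "employed"},
    {"pseudonym": "P003", "forename": "Bob", "surname": "Ray", "dob": "1975-05-07", "status": "retired"},
]

exact = link_deterministic(health, employment, "pseudonym")
likely = link_probabilistic(health, employment, ["forename", "surname", "dob"], threshold=0.6)
```

Here the exact link finds only the pair sharing a pseudonym, while the score-based link also pairs the two "Bob Ray" records, which agree on name but differ by one day in date of birth.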

Longitudinal Dataset

Data in general

A collection of data related to the same group of people over a long time to see how things change. This may involve asking the same questions at different ages.

Machine Learning (ML)

Analysis

A computer programming technique particularly suited to identifying patterns or rules in large amounts of data. Rather than beginning with a fixed set of rules, an ML program builds up ("learns") a set of likely rules by processing many example datasets (this stage of ML is known as "training"). When the set of likely rules is complete, the ML program can apply them to new datasets and offer a likely prediction (this stage is known as "inference").

For example: an ML program trained to recognise car numberplates would be trained on many pictures of car numberplates, building up a set of likely rules that will enable the program to "recognise" car numberplates in the future.
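The training and inference stages can be illustrated with a deliberately tiny sketch. It assumes a nearest-centroid classifier, chosen here only because it is short; the feature values and labels are invented, and real ML systems use far richer models and data.

```python
def train(examples):
    """Training: learn one average point (centroid) per label from labelled examples."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def infer(model, features):
    """Inference: predict the label whose centroid is closest to the new example."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(model, key=lambda label: squared_distance(model[label]))

# Hypothetical two-number "image features" with known labels.
examples = [
    ([1.0, 1.0], "plate"),
    ([1.2, 0.8], "plate"),
    ([5.0, 5.0], "not_plate"),
    ([4.8, 5.2], "not_plate"),
]

model = train(examples)
prediction = infer(model, [1.1, 0.9])
```

The "rules" learned here are just the two centroids; a new example is labelled according to whichever centroid it sits nearest to.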

See also: Machine Learning (ML) 🔗.

Malware Scanning Application

Security Management

A software application or tool that scans and detects malicious software or malware on computer systems or networks, aiming to prevent security breaches or infections.

Metadata

Data in general

Data that describes or provides information about other data. It is used to provide context, meaning, and structure to data, and helps to make it easier to understand and use. Metadata can describe various aspects of data, such as its content, format, structure, origin, quality, and usage.
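As a rough illustration, metadata for a small dataset might be captured as a simple record alongside the data. The function and field names below (`describe`, `origin`, `row_count`) are hypothetical and not taken from any particular metadata standard.

```python
def describe(rows, origin):
    """Build a small metadata record describing a list of dictionary rows."""
    columns = sorted({key for row in rows for key in row})
    return {
        "origin": origin,                # where the data came from
        "row_count": len(rows),          # how much data there is
        "columns": columns,              # structure: which fields exist
        "types": {                       # format: value types seen in each field
            col: sorted({type(row[col]).__name__ for row in rows if col in row})
            for col in columns
        },
    }

# Made-up rows for illustration only.
rows = [
    {"age": 42, "diagnosis": "asthma"},
    {"age": 35, "diagnosis": "diabetes"},
]
metadata = describe(rows, origin="hypothetical clinic extract")
```

The metadata record says nothing about any individual row; it only describes the dataset as a whole.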

Minimum Viable Product (MVP)

Computing

A version of a product or service that has the minimum set of features and functionality required to meet the needs of early adopters or customers. The goal of an MVP is to quickly validate the product idea and test the market demand, while minimizing development costs and time-to-market.

Monitoring

Management

The continuous or periodic observation, measurement, or tracking of systems, processes, activities, or events to ensure compliance, performance, or security.

National Data Guardian

Special aspects in the NHS Context

The National Data Guardian (NDG) for Health and Social Care is an independent champion for patients and the public when it comes to matters of their confidential health and social care data, and is appointed by the Secretary of State for Health and Social Care by statute. To support the development and maintenance of trustworthy systems and practices, the NDG provide advice, encouragement, and challenge to the health and social care system on the safe, appropriate, and ethical use of people’s confidential health and care information.

The NDG advise the UK government and NHS on the processing of health and adult social care data in England. Both the Caldicott Guardian and the National Data Guardian protect patient information. The Caldicott Guardian focuses on data protection within individual healthcare organisations.

See also: Caldicott Guardian.

See also: UK Government National Data Guardian 🔗.

National Data Opt-Out (NDO)

Special aspects in the NHS Context

The NHS National Data Opt-Out in England and Wales allows individuals to say 'no' to sharing their personal information for things like research without asking them first. This comes from the NHS Act Section 251 and the requirements outlined in the UK GDPR and Data Protection Act. By default, patients are included in the system. But if someone doesn't want their private information to be shared, they can choose to opt out using the National Data Opt-out; their personal information then remains exclusively for their medical care.

Natural Language Processing (NLP)

Analysis

A field of artificial intelligence (AI) that enables computer software to analyse, interpret, and generate human language. NLP allows machines to extract meaningful information from text, identify key details, uncover patterns, and detect trends within large volumes of text data, but faces challenges because words can have different meanings depending on their context, and the software cannot understand emotions or the intentions behind why certain words were chosen. Examples of NLP in the TRE space include: identifying symptoms, diagnoses and treatments in electronic health records; identifying references to mental health concerns such as suicidal thoughts, self-harm, or changes in mood from clinicians' notes; and programs to automatically identify patients based on eligibility criteria for research studies or clinical trials, employment history or status over time, or students' academic progression over time.

On-premises

Computing

Also "on-prem". See Cloud Computing.

Opt In

Health Research

An active choice, made by a participant or individual to be involved "in" research or provide their data for a research project. This is not a passive action, and cannot include individuals who have automatically been included.

Opt Out

Health Research

An active choice, made by a participant or individual to not be involved in research or provide their data for a research project. This can include individuals who choose to not be included, or withdraw their consent to their data being included and are therefore excluded or "out" of any analyses of the data.

Orchestration Zone (OZ)

Computing

The zone managing the deployment and maintenance of infrastructure and the configuration of the TRE. This zone contains no research data and is not accessible to any researcher/project role. Infrastructure management roles operate within this zone.

See also: zone.

Patient and Public Engagement (PPE)

Health Research

A purposeful set of activities designed to promote an ongoing two-way dialogue with the public about data and research, driven by active listening and responding.

Example: A researcher attending science festivals to enhance the public’s understanding of a specific topic through engaging and interactive activities.

See also: Patient and Public Involvement (PPI).

Patient and Public Involvement (PPI)

Health Research

A process by which patients and the public are included in the decision-making process within a piece of work or research. By providing their own insights and advice from personal experience they can offer unique and valuable perspectives throughout the planning, development and implementation stages.

See also: Patient and Public Engagement (PPE).

Peer Review

Health Research

A thorough evaluation process to ensure the quality and validity of scientific studies before they are shared publicly. Reviewers assess the research by looking at things like the methods used and whether the conclusions are supported by the results. They may suggest changes before recommending publication, or they may advise against publishing.

Personal Data

Data in general

UK data protection regulation defines personal data as any piece of information that someone can use to identify, with some degree of accuracy, a living person. It is also something which can confirm your physical presence somewhere.

Examples of personal data would be: a name and surname; a home address; an email address; an identification card number; location data; an Internet Protocol (IP) address; the advertising identifier of your phone.

Personal data can also be sensitive (or "Special Category Data"): see Sensitive Data.

Principal Investigator (PI)

Running and overseeing research

The researcher in charge of a project or study at a particular site (e.g. hospital or university). The PI is responsible for overseeing the study's progress, coordinating with the team members involved, and ensuring that the research is conducted according to the approved plan. The PI plays a crucial role in managing the study.

Private Cloud

Computing

See Cloud Computing.

Pseudonymisation

Identifiability

The replacement of direct identifiers within a dataset with pseudonyms so that the data no longer directly identifies individuals. In contrast to Anonymisation, pseudonymisation provides the option of reinstating the original identifiers should they be needed and also allows for the linking of datasets through the creation of common pseudonyms.
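A minimal sketch of this idea, assuming a simple lookup-table approach: each direct identifier is swapped for a random pseudonym, the table is kept separately so the original can be reinstated, and reusing the same table across datasets gives common pseudonyms for linkage. The class name, field names, and identifier values are hypothetical.

```python
import secrets

class Pseudonymiser:
    """Maps direct identifiers to random pseudonyms and keeps the lookup table."""

    def __init__(self):
        self._forward = {}   # identifier -> pseudonym
        self._reverse = {}   # pseudonym -> identifier (enables reversal)

    def pseudonym_for(self, identifier):
        # Reuse the existing pseudonym so the same person links across datasets.
        if identifier not in self._forward:
            pseudonym = "P" + secrets.token_hex(8)
            self._forward[identifier] = pseudonym
            self._reverse[pseudonym] = identifier
        return self._forward[identifier]

    def reidentify(self, pseudonym):
        # Only possible for whoever holds the lookup table.
        return self._reverse[pseudonym]

def pseudonymise(records, id_field, pseudonymiser):
    """Return copies of the records with the direct identifier replaced."""
    output = []
    for record in records:
        cleaned = {k: v for k, v in record.items() if k != id_field}
        cleaned["pseudonym"] = pseudonymiser.pseudonym_for(record[id_field])
        output.append(cleaned)
    return output

# Made-up identifiers, not real NHS numbers.
keeper = Pseudonymiser()
health = pseudonymise([{"nhs_number": "0000000001", "diagnosis": "asthma"}], "nhs_number", keeper)
jobs = pseudonymise([{"nhs_number": "0000000001", "status": "employed"}], "nhs_number", keeper)
```

Because both datasets were processed with the same `keeper`, the two output records carry the same pseudonym and can be linked without exposing the original identifier.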

See: Pseudonymisation 🔗.

See also: Anonymisation.

See also: Anonymisation and pseudonymisation 🔗.

Public Benefit

Health Research

Also known as "public good", research or activity which is motivated by its benefit to society. This work often aims to provide evidence for public policies, services or decisions to ultimately improve lives.

Example: health data research to learn more about the causes, characteristics, or effects of a disease or condition, and how to best treat it, to improve health and care of patients and the public.

Public Cloud

Computing

See Cloud Computing.

Public Dissemination

Processes

Communicating the findings of a research project or project information with the general public.

Qualitative Analysis

Analysis

Analysis without numbers means studying information based on qualities rather than quantities. Instead of focusing on numbers and statistics, this type of analysis looks at themes. It often involves interpretation and exploration, trying to understand the meaning behind the information.

For example: Qualitative analysis would be the best way to process interviews with people in which their perspectives and experiences were recorded.

Quantitative Analysis

Analysis

Analysis using numbers means studying data by focusing on quantities and measurements. This involves using mathematical and statistical methods to analyse and interpret the information. Researchers look at numerical values, such as counts, percentages, averages, or correlations, to gain insights and draw conclusions from the data. This type of analysis allows for objective and quantitative assessment of trends, patterns, and relationships within the data.

Query Management Zone (QMZ)

Computing

The zone handling queries sent to the TRE from other, remote TREs or external Job Submission services. Typically it sits alongside a Research Analytics Zone (RAZ) and provides different methods of access to approved research-ready datasets stored within the Secure Data Zone (SDZ).

See also: zone.

Registry 

Computing

A centralised database, repository, or system that stores and manages information, configurations, or records related to specific entities, such as users, systems, or resources.

Relational Database

Data in general

An organised collection of data, where data are related to each other in a systematic manner so that they can be reorganised and accessed in a number of different ways. A relational database may house one or many datasets.

See also: Database.

See also: Database 🔗.

Research Analytics Zone (RAZ)

Computing

The zone providing the means for a researcher to gain direct access to the data their project is approved to use, in an environment suitable for the analyses their research requires. This is often realised as a virtual desktop environment, a computational notebook or similar. There is often a strict requirement that project environments be completely isolated from one another.

See also: zone.

Research Approvals

Running and overseeing research

All research that involves data from individuals must get approval from an authorised body. For research with NHS data, for example, this would be the NHS Research Ethics Committee (REC). Approvals committees often include both researchers and members of the public, and their job is to make sure that the research is planned and conducted in a fair and ethical way and that it benefits the public.

Researcher(s)

Other

Individuals or groups who utilise and analyse data for research purposes or as part of their work, such as scientists, analysts, or other professionals.

Risk Assessment

Risk Management

The systematic evaluation and analysis of potential risks, threats, or vulnerabilities, including their likelihood, potential impact, and the effectiveness of existing controls or mitigation measures.

See also: Risk Assessment 🔗.

Role

Architecture

A role is a set of connected behaviors, rights, obligations and norms within a TRE system. Roles are occupied by individuals, who are called actors.

See also: Actor; Trusted Research Environment (TRE).

Routinely Collected Data

Health Services & Health Data

Data, often about people, collected by health, social, or school services during their everyday tasks, like doctor visits or school days. This is also known as "routinely collected data" or "real-world data."

This data is not specifically gathered for research purposes.

For example, routinely collected health data includes details about a patient's medical history, diagnoses, treatments, medications, etc.

Secure Data Zone (SDZ)

Computing

Zone supporting the management, linkage, curation and provision of research-ready sensitive datasets. Governance roles and Data Managers operate in the SDZ.

See also: zone.

Sensitive Data

Data in general

UK Data Protection Regulation (UK GDPR) defines sensitive data as Special Category Data, which is subject to specific processing conditions under the UK GDPR: personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs; trade-union membership; genetic data, biometric data processed solely to identify a human being; health-related data; data concerning a person’s sex life or sexual orientation.

Commercial data such as retail information, business details, IP (intellectual property) and Copyright information or confidential product details is also considered sensitive data.

Data sensitivity is classified at an institutional level within policy documents (e.g. highly confidential, confidential, not classified), with handling requirements placed on the different levels of confidentiality required. See Personal Data.

Socio-demographic Factors

Data in general

Characteristics of individuals or populations related to social and demographic aspects such as age, gender, ethnicity, socioeconomic status, and education level.

See also: Characteristic.

Specification Pillar

Architecture

A specification pillar is a group of related capabilities. SATRE has four specification pillars: Information governance, Computing technology, Data management and Supporting Capabilities.

See also: Capability.

Structured Query Language (SQL)

Computing

A computer programming language designed to organise and work with structured data stored in databases. It allows people to find and use data from databases easily.
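As a small illustration, the sketch below runs SQL statements from Python using the standard-library `sqlite3` module and an in-memory database. The `visits` table, its columns, and its rows are invented for the example.

```python
import sqlite3

# An in-memory database that disappears when the connection is closed.
conn = sqlite3.connect(":memory:")

# Define a table, then load a few made-up rows into it.
conn.execute("CREATE TABLE visits (patient TEXT, year INTEGER, diagnosis TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        ("P001", 2020, "asthma"),
        ("P002", 2021, "asthma"),
        ("P003", 2022, "diabetes"),
        ("P001", 2022, "asthma"),
    ],
)

# Find data: count visits per diagnosis from 2021 onwards.
rows = conn.execute(
    "SELECT diagnosis, COUNT(*) FROM visits "
    "WHERE year >= ? GROUP BY diagnosis ORDER BY diagnosis",
    (2021,),
).fetchall()
conn.close()
```

The `SELECT` statement describes what data is wanted (which columns, which rows, how to group them) and leaves the database to work out how to retrieve it.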

See also: Structured Data.

Structured Data

Data in general

Data which are organised and formatted using pre-defined rules, so that computational analysis is easier. For example, structured data is often stored as tables in a database where each column represents a different type of information (like numbers or words), and each cell in the table holds a single piece of data. This organisation helps with sorting, searching, and understanding the data more easily.
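The benefit of a pre-defined structure can be seen in a short sketch. It assumes a small comma-separated table with made-up column names and values, read with Python's standard `csv` module.

```python
import csv
import io

# A tiny table: each column holds one type of information, each cell one value.
raw = (
    "patient,age,diagnosis\n"
    "P001,42,asthma\n"
    "P002,35,diabetes\n"
    "P003,58,asthma\n"
)

rows = list(csv.DictReader(io.StringIO(raw)))
for row in rows:
    row["age"] = int(row["age"])  # the age column is known to hold whole numbers

# Because the structure is known in advance, sorting and searching are one-liners.
by_age = sorted(rows, key=lambda row: row["age"])
asthma_patients = [row["patient"] for row in rows if row["diagnosis"] == "asthma"]
```

The same questions would be much harder to answer if the information were buried in free-text notes.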

See also: Unstructured Data.

Study Closure

Research Management

The formal conclusion of a research study or project, including final data analysis, reporting, documentation, and archiving.

Study Onboarding

Research Management

The process of onboarding or initiating a research study, including setting up necessary infrastructure, obtaining approvals, and defining protocols or methodologies.

Study Register

Research Management

A centralised record or database that tracks and manages information about research studies or projects.

Supplier Management and Monitoring process

Management

A structured approach to managing and monitoring relationships with external suppliers, vendors and contractors, including selection, contract management and compliance oversight.

Technology Stack

Computing

The set of technologies (such as programming languages) that work together to implement a software solution.

Text Analytics

Computing

The process of examining and understanding written information, like electronic health records or other text-based content, to find important and useful insights. It involves analysing the text to identify patterns, trends, or valuable information that can be used for various purposes, such as research or decision-making.

Trusted Research Environment (TRE)

Processes

A class of computer systems which enable researchers to access sensitive datasets across administrative boundaries whilst ensuring that overall control of the data stays with appropriate governance authorities. TREs include Secure Data Environments (SDEs) in the National Health Service in England, Safe Havens in Scotland, processing environments as defined in the Digital Economy Act 2017 (DEA) and Secure Processing Environments as defined in European Health Data Space legislation. TREs are typically operated according to information governance practices and processes modelled on the Five Safes approach developed by the Office for National Statistics (ONS).

TRE Infrastructure

Computing

The set of computing resources used to implement and support a TRE. This may include desktop computers, databases, networking devices, firewalls etc. These resources may be physical (hardware owned by the TRE) or virtual (e.g. resources operated by a cloud provider).

UK General Data Protection Regulation (UK GDPR)

UK law and rules

The UK version of the European Union (EU) General Data Protection Regulation (GDPR) as recorded in the UK Data Protection Act (DPA).

In most cases, references in the UK to "GDPR" are likely to mean the UK GDPR, although the two versions are largely the same.

See also: Data Protection Act (DPA); European Union (EU) General Data Protection Regulation (GDPR).

Unconsented Data

Data in general

Personal data used for secondary purposes (such as research) where a specific, demonstrated public benefit is proven, usually with Article 6 and Article 9 in the General Data Protection Regulations as the legal basis for undertaking that secondary use of that data (as opposed to individual consent).

See also: Consent; UK General Data Protection Regulation (UK GDPR).

Unstructured Data

Data in general

Data that has limited structure, or structure that is very difficult to process computationally. Examples of such unstructured data include free text, like paragraphs of written information, or images such as X-ray or scan pictures, or scanned letters.

See also: Structured Data.

User Documentation

Computing

Written materials, guides, manuals, or instructions to assist users. Documentation typically includes information on features, step-by-step procedures and best practices to make it easier for users to work. Examples include user manuals, quick start guides and troubleshooting sections on websites.

User Interface (UI)

Computing

See Graphical User Interface (GUI); Command Line Interface (CLI).

User Onboarding

Computing

The process of introducing and integrating users into an organisation's systems and processes. It helps people understand features and learn how to use something effectively. For example, when you download a new app, there is often a step-by-step tutorial on how to make the most of the software.

Variable

Data in general

Any characteristic, number, or quantity that is represented in a dataset for each observation. In data analysis, a variable is a symbolic name to represent different types of information in datasets. For example, date of birth is a variable representing when a person was born.

See also: Characteristic.

Workflow

Computing

Specifically, a computational workflow is a set of chained operations used to carry out a particular analysis or other computational task. Workflows simplify complex sequences of activities and enable researchers to automate and track the provenance of the work in workflow execution. Workflows can often be visualised as a network or tree of operations.
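A minimal sketch of a chained workflow with provenance tracking is shown below. It assumes a purely linear chain of steps; the `run_workflow` helper and the step names are hypothetical, and real workflow systems add features such as branching, scheduling, and persistent logs.

```python
def run_workflow(steps, data):
    """Run each named step in order, recording what happened at every stage."""
    provenance = []
    for name, operation in steps:
        data = operation(data)
        provenance.append({"step": name, "output_size": len(data)})
    return data, provenance

# Three chained operations: the output of each feeds the next.
steps = [
    ("drop_missing", lambda values: [v for v in values if v is not None]),
    ("square", lambda values: [v * v for v in values]),
    ("keep_large", lambda values: [v for v in values if v > 4]),
]

result, log = run_workflow(steps, [1, None, 2, 3])
```

The `log` list gives a simple provenance trail: which steps ran, in what order, and how many values each one produced.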

Zone

Computing

A distinct area within a TRE that has specific security, access, or functional characteristics. Zones require different levels of governance and approval for the roles accessing them, and in particular, movement of data between them should be subject to appropriate controls to manage the related disclosure risks.

See also: Orchestration Zone (OZ).

See also: Query Management Zone (QMZ).

See also: Research Analytics Zone (RAZ).

See also: Secure Data Zone (SDZ).