Thursday, May 21, 2026

Inside FDP – part 4: The NHS data model

This is the fourth article in a five-part series on what FDP is for. Part 1 described how the NHS data architecture accumulated and named eight interconnected problems. Part 2 defined the seven Frontline-First dimensions and how FDP delivers them. Part 3 described the ontology, object types and actions that make FDP structurally different. This post is about how Frontline-First scales beyond a single Trust, and why the combination of a shared data model and the consistent products built against it is the most important asset in the entire programme.

The NHS has had data standards for decades. National submissions enforce schemas, coding frameworks define vocabularies, and the data is still inconsistent. The reason is that the meaning of data is not determined only by the model that describes it – it is determined by the product that captures it, and its fit to the clinical process it supports.

The problem is not just within individual organisations – it is across the whole NHS. Two hundred and twenty Trusts, each running differently configured systems, each producing data that uses the same labels but carries different meanings.

A referral in one Trust may not mean the same thing as a referral in another, even though both pass the same validation rules. Nobody using the data downstream can tell. National submissions exist to impose consistency after the fact, but they are a sticking plaster on a structural problem. The data was not consistent when it was created – no amount of downstream processing can fully restore what was lost at the point of entry.

Two teams using the same model, the same codes, and the same electronic patient record (EPR) system can produce data that means fundamentally different things. The model was faithfully implemented but the data diverged because the product allowed it.

The Canonical Data Model (CDM) is the standard data model that FDP uses to define how NHS data is structured, labelled, and connected across every Trust in the country. It is what makes data produced in Dorset comparable with data produced in Newcastle.

The CDM is essential, but it only works if the products that capture data against it constrain recording to produce data of known meaning. The data model ensures the fields and relationships are right – the consistent product ensures the meaning is right.

Without both, the NHS will continue to produce data that looks consistent at the model level but diverges at the point of capture. The combination of the two is more important than the supplier, more important than the platform technology, and more important than any individual product.

What the CDM is

Every analyst who has worked at scale in the NHS has at some point fantasised about a world where data means the same thing in every Trust – where a patient on a waiting list in Newcastle matches a patient on a waiting list in Brighton without hand-curated lookups and without “well, when they say discharge they actually mean…”

The Canonical Data Model is a single, standard data model that defines how NHS data is structured, labelled, and connected across every Trust instance
Tom Bartlett

The entire edifice of national submissions – the Secondary Uses Service, the Mental Health Services Data Set, the Community Services Data Set, the Emergency Care Data Set, and others – exists precisely because the data does not naturally line up. These are reconstruction jobs, performed monthly, on data that should have been consistent in the first place.

The CDM is a single, standard data model that defines how NHS data is structured, labelled, and connected across every Trust instance. It is what makes a product built in Dorset installable in Newcastle without rewriting code. It is what makes AI able to traverse the data and understand what a referral means, what a waiting list is, and how they relate. It is what makes the Frontline-First approach work nationally rather than just locally.

Why standards alone have not produced consistency

To understand why the CDM matters, it helps to understand what the NHS currently has in place and where the gaps are.

NHS Trusts do not have data modelling teams. Most do not have a formal data model at all. The closest thing to one is the EPR itself – the fields, the forms, the code sets and the workflow configurations that determine what data gets captured and in what structure. The EPR provides the de facto data model, but it was never designed to serve as one. It was designed to be the clinical record.

Across 220 Trusts in England, several EPR systems are in use: Epic, Oracle Health (formerly Cerner), System C, Nervecentre, SystmOne, RiO, and others. Each produces a different underlying data model.

Most of the major systems were originally built for US healthcare settings, optimised for billing rather than for clinical analysis or operational management. They were exported to the UK and configured locally, and the configuration is where the divergence begins.

The EPR is only one source. Trusts also operate workforce management systems, rostering tools, clinical audit software, quality improvement trackers, estates and facilities systems, and dozens of other operational applications, each with its own data model, none aligned to any common standard. The divergence is not just a clinical data problem. It runs through every domain the NHS operates in.

Trusts typically have digital transformation teams who tailor the EPR to local clinical workflows. This is skilled and necessary work. But it is rarely documented in any formal sense. The changes are made in the system configuration, not written down in a data modelling tool or a design specification. There is no artefact that says, “This is what our data model is and this is why.” The model is implicit in the system rather than explicit in a document.

Even within a single Trust running a single EPR, the same operational process can output data in different formats across services and even across teams within the same service.

I saw this clearly in one London mental health Trust where the Child and Adolescent Mental Health Services (CAMHS) team had decided to open a new referral record for every appointment.

Other service lines in the same Trust used a single referral to encapsulate all appointments across a patient’s entire care episode. Both were valid approaches within the EPR’s configuration – both produced referral data. But the data meant fundamentally different things.

Reconciling them consumed hours of analyst time that nobody outside the data team ever saw. A national data model alone would not have prevented it – both teams were using the same EPR, the same logical model, the same referral object. The divergence happened because the product allowed it.
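The divergence described above can be shown with a minimal sketch. The records, team labels and field names below are hypothetical and are not CDM fields. Both teams record three appointments for one patient and both pass the same structural checks, yet a simple count of referrals gives different answers.

```python
from collections import Counter

# Hypothetical appointment-level records (illustrative field names, not CDM fields).
# Team A opens a new referral for every appointment.
# Team B uses one referral for the whole care episode.
appointments = [
    {"team": "A", "patient_id": "p1", "referral_id": "r1"},
    {"team": "A", "patient_id": "p1", "referral_id": "r2"},
    {"team": "A", "patient_id": "p1", "referral_id": "r3"},
    {"team": "B", "patient_id": "p2", "referral_id": "r4"},
    {"team": "B", "patient_id": "p2", "referral_id": "r4"},
    {"team": "B", "patient_id": "p2", "referral_id": "r4"},
]


def referrals_per_team(rows):
    """Count distinct referral records per team."""
    seen = {}
    for row in rows:
        seen.setdefault(row["team"], set()).add(row["referral_id"])
    return {team: len(ids) for team, ids in seen.items()}


def appointments_per_referral(rows):
    """Mean number of appointments attached to each referral, per team."""
    per_referral = Counter((row["team"], row["referral_id"]) for row in rows)
    totals, counts = Counter(), Counter()
    for (team, _referral_id), n in per_referral.items():
        totals[team] += n
        counts[team] += 1
    return {team: totals[team] / counts[team] for team in totals}


referral_counts = referrals_per_team(appointments)
ratio = appointments_per_referral(appointments)
print(referral_counts, ratio)
```

The same activity appears as three referrals for one team and one referral for the other. A ratio such as appointments per referral is one simple signal an analyst might use to spot this kind of divergence, although it cannot say which convention is the intended one.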

There is a deeper problem underneath this. Building a product that fits a clinical process requires knowing what that process actually is.

In many services, particularly in mental health and community care, the operating model is case management rather than a defined pathway. The clinician holds a caseload of patients whose care is driven by ongoing assessment, intervention, and review in a cycle that responds to changing needs. There is no predefined sequence of steps – each patient’s journey is shaped by clinical judgement in the moment.

When I was involved in building Frontline-First applications for a CAMHS waiting list at one Trust, the clinical team could not describe their process in terms that translated to a product specification. We used process mining to reconstruct a view of what was actually happening, building the picture from the event log of real activity rather than from what the service design stated should happen. The result was spaghetti.

But the team were not disorganised – they were doing case management as it is designed to work, case by case, responding to the patient in front of them. What was missing was not a standard for their clinical practice but a product designed to capture the data their practice generates in a way that is consistent and analysable.
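For readers unfamiliar with process mining, the core step can be sketched as building a directly-follows graph from an event log. This is a generic illustration, not the tooling used in the work described above. The event log, case identifiers and activity names are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (case_id, timestamp, activity).
event_log = [
    ("c1", 1, "referral received"),
    ("c1", 2, "assessment"),
    ("c1", 3, "intervention"),
    ("c1", 4, "review"),
    ("c1", 5, "intervention"),
    ("c2", 1, "referral received"),
    ("c2", 2, "review"),
    ("c2", 3, "assessment"),
]


def directly_follows(log):
    """Count how often activity b immediately follows activity a within a case."""
    traces = defaultdict(list)
    # Order events by case and then by time so each trace is chronological.
    for case_id, _timestamp, activity in sorted(log, key=lambda e: (e[0], e[1])):
        traces[case_id].append(activity)
    edges = Counter()
    for activities in traces.values():
        for a, b in zip(activities, activities[1:]):
            edges[(a, b)] += 1
    return edges


dfg = directly_follows(event_log)
for (a, b), n in sorted(dfg.items()):
    print(f"{a} -> {b}: {n}")
```

In a defined pathway, a few edges carry most of the traffic. In case management, almost every activity can follow almost every other, which is what the spaghetti picture looks like in graph form.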

I heard a senior leader in the FDP programme state that every patient is on a pathway and that this should be a core concept in the implementation of the software. It is an attractive organising principle, but it is aspirational rather than descriptive.

Many clinical teams do not work to a pathway in any formal sense, and building products as if they do creates friction rather than support. These services need products designed for their operational reality, not products that impose a pathway structure on work that does not follow one.

Even where processes are defined, they are not static. Quality improvement projects, cost improvement plans, and commissioning decisions all change clinical pathways, sometimes gradually and sometimes overnight. The tool has to be able to adapt.

This is where the modular approach has a structural advantage – updating a focused application to reflect a process change is faster and less risky than reconfiguring a monolithic EPR.

But variation is not just about change over time. It is also about compliance. Some clinicians follow the agreed process. Others adapt it to their own practice, sometimes for good clinical reasons and sometimes out of habit. I worked in one Trust where a psychiatrist used the EPR so differently from every other clinician that the data warehouse had to be built to accommodate his approach as a special case.

This is not unusual. Every Trust has examples. The question is whether the tool makes compliance easy enough that most people follow the agreed process most of the time, or whether it is so burdensome that deviation is the rational choice. This is the Frontline-First argument again – the easier it is to do the right thing, the more people will do it.

Semantic consistency therefore depends on three things working together: a shared data model, consistent products built against it, and the fit of those products to the clinical processes they support.

Where the clinical process is well understood, this combination works. Where the process is undefined, non-standard, or not followed, the product alone cannot fix it.

Clinical process variation across the NHS is a problem that this series of articles cannot solve. But the data model and the consistent products are the foundation. Whether consistency is achieved depends on the third element – products that fit how clinical teams actually work.

The downstream consequences of getting this wrong are real. It means research papers built on data that appears nationally consistent but carries hidden semantic variation; service evaluations that compare unlike with unlike; policy decisions informed by metrics whose definitions differ between the Trusts that contributed them; and performance reviews that hold clinical teams to account against numbers that do not accurately reflect what happened.

Getting the data model and the products right is not a technical nicety. It is the difference between a national data infrastructure that can be trusted and one that produces confident-looking analysis on unreliable foundations.

Currently there is no systematic way of measuring this. Weiskopf and Weng, in their 2013 framework for EHR data quality assessment, introduced a specific term for it: concordance, defined as the degree of agreement between the same data elements recorded across different sites or systems. It is a measurable dimension of data quality, distinct from completeness or correctness.

Research networks in other countries have built cross-site concordance assessments against their common data models. The NHS has nothing equivalent. Until the CDM and consistent products are in place and a concordance measurement framework is established, the scale of semantic divergence across the NHS will remain invisible, and every downstream use of the data will carry an unmeasured margin of error.
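To make the idea of measuring concordance concrete, here is a minimal sketch that computes a simple agreement rate between two sources holding the same data element for overlapping patients. This is an assumed, simplified metric for illustration only. It is not the method from the framework cited above, and the patient identifiers and status values are hypothetical.

```python
def concordance(source_a, source_b):
    """Share of keys present in both sources whose values agree.

    Returns None when the sources have no keys in common.
    """
    shared = source_a.keys() & source_b.keys()
    if not shared:
        return None
    agreeing = sum(1 for key in shared if source_a[key] == source_b[key])
    return agreeing / len(shared)


# Hypothetical pathway status for the same patients in two systems.
local_system = {"p1": "discharged", "p2": "waiting", "p3": "waiting", "p4": "discharged"}
national_extract = {"p1": "discharged", "p2": "waiting", "p3": "discharged", "p5": "waiting"}

score = concordance(local_system, national_extract)
print(f"concordance: {score:.2f}")
```

Only patients present in both sources are compared, so this measures agreement rather than completeness. A real framework would also need to handle coded values that are equivalent but not identical, which exact matching does not.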

Why the CDM is the component most at risk of being neglected

The CDM is not more important than the products or the clinical processes they support – as this article has argued, all three are needed together. But the CDM is the component most likely to be neglected, for two reasons.

First, it is invisible to everyone except the people who build on it. Without a common data model, Trusts resort to point-to-point mappings between systems – each one hand-built, each one expensive to maintain, each one a source of semantic divergence that compounds over time. The NHS has been building these mappings for decades and the integration costs only grow. The CDM is the alternative – define the standard once, map to it once per source system, and everything built on top of it inherits the consistency.
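The arithmetic behind this can be sketched briefly. The assumption here is a worst case in which every system exchanges data with every other system and each direction needs its own mapping. Real estates are rarely fully connected, so the point-to-point figure is an upper bound rather than a measurement.

```python
def point_to_point_mappings(n_systems):
    """One mapping per ordered pair of systems (each direction built separately)."""
    return n_systems * (n_systems - 1)


def canonical_mappings(n_systems):
    """One mapping from each source system to the shared model."""
    return n_systems


for n in (5, 20, 50):
    print(n, point_to_point_mappings(n), canonical_mappings(n))
```

Point-to-point mappings grow with the square of the number of systems, while mappings to a shared model grow linearly. That is why the cost of integration keeps rising when there is no common model to map to.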

Second, the CDM has ambitions beyond FDP. It is the foundation for the Single Patient Record and for every programme in the £10bn spending review allocation that depends on consistent NHS data. If the CDM is wrong, or under-governed, or under-resourced, it is not just FDP that suffers – it is the entire national data infrastructure for years to come.

There is a further dimension that is easy to overlook – every AI capability on the platform, from Ask FDP to the decision layer to AI-FDE, is only as good as the data it operates on. If semantic consistency across the NHS remains low, every AI application built on the data inherits that inconsistency. The AI gives confident answers grounded in data whose meaning varies between the Trusts that contributed it.

The CDM is not just the foundation for interoperability – it is the foundation for trustworthy AI in the NHS.

Why the Frontline-First approach depends on the CDM

Several of the Frontline-First dimensions described in Part 2 depend on the CDM.

Enrichment only works at national scale if the data means the same thing across Trusts. A consultant in Newcastle can only compare complication rates with peers in Brighton if both Trusts are recording procedures using the same definitions. Without the CDM, the comparison is misleading at best and dangerous at worst.

Cross-setting collaboration only works if the data model is consistent across care settings. The mental health patient in A&E can only have their community history surfaced if the acute and mental health data conform to the same model. Without the CDM, the A&E clinician sees either nothing or something they cannot interpret reliably.

Scalable, portable products only work if the platform the application is built on uses a standard model. Without the CDM, an application built by one Trust cannot be adopted by another without significant rework. The CDM is what makes portability possible. The Solution Exchange is the mechanism that makes it happen.

Operational products at the point of care only produce nationally consistent data if the products themselves are built against the CDM. A Trust can build its own discharge coordinator product on FDP for local use, but it cannot share that product nationally and the data it captures will not be nationally consistent unless the CDM is underneath it. The CDM is what turns a useful local product into a nationally reusable asset.

The Solution Exchange: how Frontline-First scales

The Solution Exchange is FDP’s mechanism for packaging and distributing operational products across Trusts. It is the commercial and technical model that turns FDP from a platform with a handful of nationally commissioned products into a platform with hundreds or thousands of products built by the people closest to the work.

The model works like this. A Trust team in Dorset builds a product that helps its clinical team manage patient flow. The product is not just built on FDP – it is built against the CDM, so the data it captures is nationally consistent.

But the CDM is only part of what makes it portable. The product itself embodies a consistent design for that clinical workflow – the fields, the validation rules, the actions, the decision points are all designed for the specific process the product supports. When a Trust in Newcastle adopts the same product, they get both the consistent data model and the consistent product design. They do not need to redesign the workflow, remap the fields, or rebuild the logic. They install it and it works.

This is made possible by Foundry’s built-in Marketplace feature, which is the platform capability underpinning the Solution Exchange. Marketplace allows a product to be packaged with all of its components – object types, datasets, applications, pipelines, and functions – into a single deployable package with automatic dependency resolution and version control. When the package is installed on another Trust’s FDP instance, environment-specific configuration is handled automatically and the ontology entities are remapped to the local instance.

This is not a manual export and rebuild – it is a managed deployment that preserves the product’s design integrity across every installation.
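The general idea of dependency resolution can be sketched with a topological sort. This is a generic illustration using the Python standard library, not Foundry's Marketplace API or its package format. The manifest structure and the component names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical package manifest: each component lists the components it depends on.
manifest = {
    "dataset:referrals": [],
    "pipeline:clean_referrals": ["dataset:referrals"],
    "object_type:Referral": ["pipeline:clean_referrals"],
    "function:days_waiting": ["object_type:Referral"],
    "application:waiting_list": ["object_type:Referral", "function:days_waiting"],
}


def install_order(components):
    """Return an order in which every component comes after its dependencies.

    TopologicalSorter raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(components).static_order())


order = install_order(manifest)
for step, component in enumerate(order, start=1):
    print(step, component)
```

Installing in this order means no component is deployed before the things it relies on, which is the property that lets a package arrive as a working whole rather than a set of parts to be reassembled by hand.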

The Solution Exchange is the NHS’s commercial and governance wrapper around this capability. NHS England is actively developing the framework, with engagement already underway with third-party suppliers who want to build products for the marketplace.

For technology companies, this creates a market that has never existed in the NHS – build once, deploy across 220 Trusts, with the CDM ensuring the data is consistent and the Marketplace ensuring the product installs cleanly.

For Trusts, it means access to a growing library of operational products, some built by other Trusts, some by commercial suppliers, all guaranteed to work because the platform, the data model, and the product design are consistent.

This is the mechanism by which the Frontline-First approach stops depending on what a central programme team can develop and starts depending on what the entire NHS and its supplier ecosystem can contribute.

Nationally built products like Optica and the Care Coordination Solutions are the starting point. The Solution Exchange is how the platform fulfils its potential. Without the CDM, the Solution Exchange is a marketplace with nothing portable to sell.

Where the CDM is now

I was close to this work during my time at NHS England and have seen it from the inside.

The CDM is alive, in use across all Trust FDP instances, and backing every national FDP product. It is published openly on GitHub and Trusts are actively engineering on it. The core model contains 98 entities. The programme has created a functioning CDM governance process that is unusual in the NHS – a fortnightly review group, a dedicated Data Model Manager tool for navigating and extending the model, and a suggestions inbox for Trusts to propose changes.

The architecture team at NHS England created a governance function based on proper data management principles, using DAMA concepts, that actually got funded and staffed. That is a rare achievement in a system that has historically treated data governance as an afterthought. The Trusts that are using the process are consulting with peers before submitting changes to ensure their additions are not specific to their own systems. This is encouraging and deserves credit.

But there are problems that need honest acknowledgement. Data engineers and analysts working with the CDM are finding inconsistencies between what the model defines and what the data contains, and are questioning the relatively sparse nature of the model in areas that matter to their services.

These are not complaints from people who have not engaged. They are questions from people who are building on the CDM and finding it does not yet cover what they need. These questions deserve answers, and quickly, because unanswered questions erode confidence in the programme at exactly the moment when adoption needs to accelerate.

There is also a structural issue. Two parallel versions of the CDM developed during the programme.

The architecture team within NHS England built a carefully designed data standard, published on NHS Futures. The delivery team, under pressure to get Trusts live, built the operational model that actually powers the platform, published on GitHub. Both are public. Neither has been fully converged with the other. This happened because delivery was succeeding faster than the governance could keep pace with. That is a success problem, not a failure. But it means the governance now needs to catch up.

The governance process itself is not widely known across the adopting community. It is untested at the scale it will need to handle as more Trusts start building locally and extending the CDM for their own operational needs. The difference between adding new data items, which is relatively straightforward, and editing existing ones, which is a much harder governance challenge, has not been clearly addressed.

If the argument of this article is right, that consistent products matter as much as a consistent model, then governance cannot stop at the CDM. The Solution Exchange also needs governance over which products are endorsed for which clinical workflows.

Without this, the NHS risks five competing discharge products each producing a different interpretation of the same CDM fields, and the semantic consistency the CDM was designed to achieve is lost at the product layer.

External scrutiny of both the data model and the product standards should sit above the delivery team and the supplier – local agility to build and adapt should sit within both, bounded by the standards but not blocked by them.

There is a model for how this can work. The OHDSI community governs the OMOP Common Data Model across 600 million patient records in over 30 countries. Their approach is open source and community-driven – domain-specific working groups develop extensions in parallel, changes go through community review and testing across multiple sites, versioned releases provide clear migration paths, and automated tooling lets any site measure its own conformance. The model belongs to the community, not to a delivery team. It scales because it is distributed rather than centralised. It is also, in its own way, a Frontline-First approach to data modelling – the people closest to the data define what the model needs. The central role is to coordinate, quality assure, and release – not to design everything from the top down.

The NHS CDM governance should learn from this. It needs domain-based working groups covering acute, mental health, community, workforce, and cancer that can develop extensions in parallel rather than queuing through a single bottleneck. It needs automated conformance tooling so Trusts can measure their own alignment without waiting for a manual review. It needs versioned releases with clear migration paths so that Trusts building on the CDM today are not disrupted by changes tomorrow. It needs a clear process for edits to existing data items, not just additions of new ones. It needs visibility – the process should be known to every Trust building on FDP, not just the early adopters who happen to have found it.

All of this needs proper resourcing. The central architecture team at NHS England has been through endless recruitment freezes and headcount cuts that have left it unable to keep pace with the delivery team’s progress. A governance process without the staff to do the modelling, review the contributions, and maintain the standard is governance in name only.

The CDM should also not be built entirely by a central team. The workforce lens of the CDM is a case in point – when I was at NHS England, it was being written slowly by part-time staff, even though mature workforce management systems already operate at the frontline of care and could contribute their models directly.

There are many organisations in the private sector that support the NHS mission and have built mature systems that complement the NHS data estate. A well-governed CDM with a clear contribution process could draw on that expertise rather than trying to reinvent every model centrally from scratch.

Why this matters more than the supplier debate

The public conversation about FDP has been dominated by the question of whether Palantir should be the supplier. That question has its place. But the combination of the CDM and consistent products is more consequential than the choice of supplier.

A well-governed data model with consistent products on a mediocre platform would still deliver national consistency. A poorly governed data model with inconsistent products on a brilliant platform would deliver fragmentation dressed up as standardisation.

For 20 years, successive governments have left the NHS to solve its data problems on its own. It has tried its very best. I have watched some of the brightest minds in NHS informatics pour their careers into making it work with the tools and the funding available to them.

But it has not produced the result the NHS needs, because the problem is structural and the NHS cannot create sustained bandwidth for what looks to its executive teams like a niche technical detail. At every executive meeting and every board meeting across the organisation, the data perspective is drowned out by the healthcare emergency of the day. The bandwidth is never there, and it never will be without external intervention.

The government has boldly created the Health Data Research Service, recognising the need to join up data from across the four nations. But the research outputs it supports will be severely hampered if the underlying data cannot be integrated because it was never semantically consistent in the first place.

Ministers need to recognise the critical nature of this capability. If they do not, we will spend another 20 years building data infrastructure on foundations that cannot support what we need it to do, and the NHS will continue to make decisions on data that looks right but is not.

The CDM needs proper governance. But the organisation responsible for it cannot recruit, cannot retain, and is losing half its staff through voluntary redundancy. NHS England in its current state cannot steward the single most important piece of data infrastructure in the country. The gap is already being felt at every level.

When Palantir needed a data model to build the national FDP products, none existed, so it built its own. Separately, the Tech and Data Integration team in NHS England’s Data Services division built a parallel CDM and published it on NHS Futures. Locally, chief data officers in hospital groups who are expanding their use of FDP need a way of extending their local data models and aligning them with the national CDM. There is no single place where these three efforts converge, and no governance that connects them.

The NHS has had eight national data strategies over the past two decades. None of them has produced a governed, funded, enduring national data model.

The pattern is always the same – a strategy is published, a team is assembled, funding is allocated, restructuring hits, the team is dispersed, and the work is quietly absorbed into whatever comes next.

The CDM cannot survive another cycle of this. It needs a dedicated, well-funded body whose sole purpose is to steward the data model, the product standards, and the contribution process that this article describes. That body needs to be independent of the FDP delivery programme, independent of any single supplier relationship, and protected from the restructuring cycle that has dismantled governance functions in the NHS before.

This is not a problem the NHS can solve for itself. It is for government to establish and fund, and it is the single highest-value investment ministers could make in NHS data infrastructure. The CDM and the consistent products built against it are the one thing in this debate that supporters and critics of FDP alike agree is needed. Getting this right matters more than who supplies the platform.

Next in the series

The fifth and final article addresses the objections I hear most often, including whether the NHS needs a single platform at all, why we cannot just build our own, and why the answer matters more urgently than the current debate suggests.
