Replacing SaaS with on-premise AI: the unit economics that make sovereign-AI inevitable

The case for sovereign AI usually gets made on values: privacy, control, sovereignty. Those arguments are real. They also frequently lose when a CFO actually puts the spreadsheet together — because for most organisations the per-seat SaaS bill is small enough to feel like rounding, and the engineering cost of replacement is large enough to feel like a project. The interesting story is what happens when you do put the spreadsheet together properly, with all the second-order effects, over a 36-month window. The economics don't just favour sovereign AI in some niche scenarios. In a meaningful share of operating businesses, they make sovereign AI inevitable.

This piece is the spreadsheet, written out as prose, with the assumptions visible. It is opinionated and it shows its working.

The standard SaaS bundle, in 2026 reality

A representative mid-size operating business — say 30–50 seats with meaningful AI exposure — currently runs through a stack that looks like this:

Workflow / orchestration platform — per-seat or per-execution pricing. £200–800/month at this scale.
AI-augmented CRM with built-in copilot — per-seat licensing premium for the AI tier. £30–60/user/month above the base.
Document automation, AI-assisted writing tools, AI-assisted email, AI-augmented BI — each on its own per-seat or per-volume pricing line.
One or more frontier AI API budgets for whatever's not covered by the above. £200–2,000/month depending on volume.
Per-employee licence for an AI-assisted IDE or coding tool, if engineering is a meaningful function. £15–40/seat/month.
The voice/transcription layer (meeting notes, call summaries, voice tooling). Another per-seat tier.

Total monthly bill at this scale typically lands somewhere between £2,500 and £10,000. Per-seat costs scale with headcount; per-volume costs scale with usage. Both lines tend to grow faster than headcount once AI usage actually gets adopted. The CFO sees a bill going up and to the right.

What the on-premise alternative looks like

The on-premise replacement is not a single monolithic system. It's a small constellation of self-hosted components serving the same functional roles, sharing infrastructure:

Workflow engine — self-hosted on a small server. One-time setup cost, near-zero ongoing.
Local inference cluster — Apple-Silicon or equivalent, sized for working concurrent load. Capital expenditure £3,000–10,000 depending on spec, amortised over 3–4 years.
Self-hosted CRM — open-source platform, run on shared infrastructure. The AI augmentation comes from the local inference layer rather than a vendor's per-seat tier.
Self-hosted knowledge graph and document store — open-source databases, shared infrastructure.
Self-hosted observability — open-source monitoring stack, near-zero ongoing.
Frontier AI budget — meaningfully smaller, since most workloads serve from local. £100–600/month for the cases that genuinely need frontier capability.

Total ongoing bill, after setup: typically £200–1,500/month including the frontier AI overflow and shared infrastructure costs. Capital expenditure on the inference hardware sits separately and amortises over years. The payback period, meaning the month in which cumulative savings overtake the cumulative setup cost, is usually 12–18 months for businesses at the scale described.
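
To make the payback definition concrete, here is a minimal sketch in Python. The function name and every figure in the example are illustrative assumptions chosen to sit near the ranges above; they are not data from a real engagement, and the model deliberately ignores price changes and financing costs.

```python
def payback_month(setup_cost, saas_monthly, onprem_monthly, horizon=36):
    """Return the first month in which cumulative savings cover the setup cost.

    Returns None if that never happens within `horizon` months.
    """
    monthly_saving = saas_monthly - onprem_monthly
    cumulative = 0.0
    for month in range(1, horizon + 1):
        cumulative += monthly_saving
        if cumulative >= setup_cost:
            return month
    return None


# Hypothetical inputs: £48,000 total setup (hardware plus engineering),
# a £5,000/month SaaS bill and a £1,000/month on-prem running cost.
print(payback_month(48_000, 5_000, 1_000))  # 12
```

If the function returns None for your own figures, the replacement does not pay back inside the window, which is itself a useful answer.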

The full unit economics, with second-order effects

The simple savings calculation understates the case. The second-order effects are bigger than the first-order savings:

Workload elasticity. On per-volume SaaS, scaling AI usage means scaling the bill. On local infrastructure, scaling AI usage means using more of the capacity you already paid for. Once the inference cluster is sized, you can grow your AI calls tenfold without increasing the bill, as long as the load stays within the capacity you bought. SaaS economics punish adoption; on-premise economics reward it.
Composition cost. Each SaaS tool has its own integration story, often paid (per-connector pricing). On-premise tools share a single bus and integrate at zero marginal cost.
Vendor change cost. When a SaaS vendor changes their AI provider, raises prices, deprecates a feature, or restricts your region, you absorb the cost. On-premise is insulated from these.
Data sovereignty value. For organisations with regulatory exposure, the cost of a data-residency incident on a SaaS stack is multiples of the annual SaaS bill. On-premise reduces this risk to near-zero. This rarely shows up in the spreadsheet but should.
Tool consolidation. A self-hosted AI layer often replaces 4–6 separate SaaS tools (one per use case) with a single capability layer used everywhere. The hidden tax of "yet another AI tool" — onboarding, training, vendor management — disappears.
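
The workload-elasticity point is the easiest of these to put into numbers. The sketch below compares a metered bill with a fixed-capacity bill at two usage levels. The per-call price, the fixed monthly cost and the capacity figure are all assumptions made up for illustration, and the fixed-cost function simply refuses to answer once demand exceeds capacity rather than guessing an overflow price.

```python
def metered_cost(calls, price_per_call):
    """Per-volume SaaS: the bill rises linearly with usage."""
    return calls * price_per_call


def fixed_capacity_cost(calls, monthly_fixed, capacity_calls):
    """Local cluster: a flat monthly cost, valid only up to its capacity."""
    if calls > capacity_calls:
        raise ValueError("demand exceeds cluster capacity; resize or route overflow")
    return monthly_fixed


# Hypothetical figures: £0.004 per metered call, a £600/month local cluster
# that can serve up to 2,000,000 calls per month.
for calls in (100_000, 1_000_000):
    metered = metered_cost(calls, 0.004)
    local = fixed_capacity_cost(calls, 600.0, 2_000_000)
    print(f"{calls:>9} calls: metered £{metered:,.0f}, local £{local:,.0f}")
```

At the lower volume the metered bill is the cheaper of the two in this example; at ten times the volume the ordering flips, which is the whole argument in two lines of output.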

The honest costs people don't talk about

The case for on-premise is not free. The costs are real and need to be on the spreadsheet:

Setup engineering. Standing up the inference layer, the orchestration platform, the CRM, the knowledge graph and the observability is a project. For most teams, 4–8 weeks of focused engineering attention or an external consultant equivalent.
Operational ownership. On-prem isn't run-and-forget. Someone has to monitor it, update it, respond when the local inference cluster has a bad day. This is typically 5–10% of an engineer's time once stable, more during change.
Hardware lifecycle. Inference hardware needs refresh. Plan for a 3–4 year amortisation, not lifetime. The good news: capability per pound of hardware has been improving meaningfully each generation, so refreshes upgrade rather than just replace.
The capability ceiling cost. Local inference cannot replace frontier capability. Workloads that genuinely need frontier reasoning still go to a hosted endpoint. Plan for a residual hosted bill — non-zero but small — and don't pretend otherwise.
The skills shift. Running on-prem requires different skills than configuring SaaS. If your team doesn't have them, you either hire, train, or partner. Don't underestimate this — it's the single most common reason on-prem rollouts stall.
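
Putting these costs on the spreadsheet can be sketched as a simple total-cost-of-ownership function. Everything here is an assumption for illustration: the field names, the straight-line treatment of hardware, and the example figures (including the engineer's monthly cost, which the article does not state). Hiring or training costs from the skills shift are left out and would need their own line.

```python
from dataclasses import dataclass


@dataclass
class OnPremCosts:
    hardware_capex: float         # inference hardware purchase
    setup_engineering: float      # one-off project cost
    monthly_running: float        # hosting, power, residual frontier API bill
    engineer_monthly_cost: float  # fully loaded cost of one engineer
    ops_fraction: float           # share of that engineer's time spent on upkeep


def onprem_tco(costs, months, hardware_life_months):
    """Total cost over the window, charging only the used share of hardware life."""
    used = min(months, hardware_life_months) / hardware_life_months
    hardware = costs.hardware_capex * used
    ops = costs.engineer_monthly_cost * costs.ops_fraction * months
    return hardware + costs.setup_engineering + costs.monthly_running * months + ops


def saas_tco(monthly_bill, months):
    """Total SaaS spend over the window at a flat monthly bill."""
    return monthly_bill * months


# Hypothetical case: £8,000 hardware on a 48-month life, £30,000 setup,
# £1,000/month running, 10% of a £7,000/month engineer, versus £5,000/month SaaS.
example = OnPremCosts(8_000, 30_000, 1_000, 7_000, 0.10)
print(round(onprem_tco(example, 36, 48)), saas_tco(5_000, 36))
```

The operational-ownership line is worth watching: in this made-up case it is larger than the hardware itself over 36 months.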

When SaaS is still the right answer

I am not arguing for full SaaS displacement. There are clear cases where SaaS remains the right answer:

Sub-scale operations. Below ~10 seats with low AI volume, the SaaS bill is small and the ops cost of on-prem doesn't pay back.
Highly bursty workloads. If your AI usage spikes 10x for two weeks a year and is otherwise moderate, sizing local for the peak is wasteful. Cloud absorbs the spike at meter-rate.
Frontier-capability dependent products. If your core product depends on the latest frontier model's reasoning quality, you're paying the API cost and shouldn't replicate that locally.
Highly regulated, industry-specific SaaS. Where the vendor handles certifications and audit-readiness for your sector, recreating that posture on-prem costs more than the licence.
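
The bursty-workload case can be checked with a few lines of arithmetic. The sketch below compares sizing local capacity for the peak against sizing it for the baseline and sending the excess to a metered endpoint. It assumes, purely for illustration, that local capacity cost scales linearly with capacity, and it models the spike as one month rather than two weeks; the demand figures and both prices are invented.

```python
def annual_cost(monthly_demand, local_capacity, capacity_cost_per_call, metered_price_per_call):
    """Cost over the period when local serves up to `local_capacity` calls a month
    and any excess goes to a metered endpoint."""
    fixed = local_capacity * capacity_cost_per_call * len(monthly_demand)
    overflow_calls = sum(max(0, demand - local_capacity) for demand in monthly_demand)
    return fixed + overflow_calls * metered_price_per_call


# Hypothetical year: 100,000 calls in eleven months and a 10x spike in one.
demand = [100_000] * 11 + [1_000_000]

size_for_peak = annual_cost(demand, 1_000_000, 0.001, 0.004)
size_for_base = annual_cost(demand, 100_000, 0.001, 0.004)
print(round(size_for_peak), round(size_for_base))
```

With these made-up numbers, sizing for the baseline and paying meter-rate for the spike comes out well under half the cost of sizing for the peak.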

The pragmatic shape for most operating businesses at scale is hybrid: a strong on-prem core for the workloads that fit, surgical SaaS for the workloads where SaaS is genuinely the right tool.

The migration sequence

If the unit economics make sense, the migration is its own engineering exercise. The sequence I run with:

Stand up the local inference layer first. Don't try to migrate any workload until the substrate works. Run it in a non-critical role for a month to shake out the operational issues.
Migrate the lowest-risk, highest-volume workload next. Usually classification, extraction or summarisation work. Volume guarantees the savings show up; low risk means a regression doesn't blow up the business.
Build the model router. Even before you've migrated everything, the router lets you do controlled rollouts and roll back easily. This is the architectural keystone.
Migrate the orchestration platform. Once the workflow engine itself is local, every workflow you build is on-prem-native.
Migrate the CRM, knowledge graph and observability. Each one is its own sub-project. Run old and new in parallel for a transition window before cutting over.
Decommission what you no longer need. Cancel the SaaS lines as the on-prem replacements prove out. This is where the savings actually land in the P&L.
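
For the model router step, a minimal sketch of the idea looks like the following. This is not the router I run in production; the class, its method names and the "local"/"hosted" labels are hypothetical, and a real router would also need health checks, fallbacks and logging. It only shows the two properties that matter for migration: percentage rollouts per task type and a one-call rollback.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ModelRouter:
    """Routes each request to 'local' or 'hosted' by task type and rollout percentage."""
    rollout: dict = field(default_factory=dict)  # task name -> percent served locally

    def set_rollout(self, task, percent):
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self.rollout[task] = percent

    def rollback(self, task):
        """Send all traffic for this task back to the hosted endpoint."""
        self.rollout[task] = 0

    def route(self, task, request_id):
        percent = self.rollout.get(task, 0)  # unknown tasks stay on hosted
        # Stable hash, so the same request id always lands in the same bucket.
        digest = hashlib.sha256(f"{task}:{request_id}".encode()).digest()
        bucket = int.from_bytes(digest[:4], "big") % 100
        return "local" if bucket < percent else "hosted"


router = ModelRouter()
router.set_rollout("summarise", 25)  # start with a quarter of traffic on local
print(router.route("summarise", "req-001"))
router.rollback("summarise")         # a regression shows up: everything goes back
print(router.route("summarise", "req-001"))  # hosted
```

Hashing the request id rather than drawing a random number keeps routing deterministic, which makes a regression reproducible when you go looking for it.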

What we see in our own books

For our own operations, the migration started in early 2024 and finished in late 2025. The numbers are honest: the inference cluster paid back inside 14 months. The aggregate ongoing AI-related SaaS spend dropped by roughly 80% versus the 2023 baseline, while the volume of AI-touched workflows multiplied roughly 10x. Operational ownership stabilised at about half a day per week. The capability ceiling residual (the workloads still going to frontier endpoints) sits at around 20% of total inference calls and is steadily falling as open-weights models close the gap.

This is one operator's data, not a representative survey. But the shape — non-trivial setup cost, fast payback, ongoing savings that compound, AI usage rising with no marginal cost — is what I see across every serious sovereign-AI rollout I've worked on. The economics are genuinely good. The blocker is rarely cost-benefit; it's almost always the skills shift and the willingness to own the operational layer.

The strategic angle most CFOs miss

The pure cost analysis above understates the strategic case in one important way: it treats SaaS as a stable platform, when in practice the SaaS layer is being aggressively repriced as AI capabilities are bundled into it. Vendors that historically charged £20/seat/month are now charging £80/seat/month for the AI tier, and the AI tier is increasingly the only tier on offer. The bill is going up not because you're using more — it's going up because the vendor is repricing the capability you depend on.

This is the rational vendor strategy. AI features have a perceived value that supports higher pricing, and switching costs make the price increase sticky. The customer-side response should be: own the AI layer. Once you own it, the vendor's repricing power vanishes. You can choose which tier of which SaaS product you actually need based on the product's non-AI value, and supply the AI layer yourself.

This is the move that goes from defensive cost management to offensive position-building. The AI layer is becoming a core piece of operational infrastructure for any business at scale. The companies that own it will, over the next three to five years, have meaningful structural advantages over the companies that rent it. That's not a quarterly P&L story; it's a 36-month strategic story. The CFO who makes the case in those terms wins the room.

The values arguments for sovereign AI — privacy, control, vendor independence — are real but they are the persuasion frame, not the hard case. The hard case is unit economics. Once you put the second-order effects on the spreadsheet — workload elasticity, vendor change cost, tool consolidation, data sovereignty — the on-premise alternative wins for most operating businesses past a small scale. The payback periods are short. The compounding is real. The blocker is the skills shift, not the maths.

If you've been wondering whether the move is worth it, do the spreadsheet properly. Most of the time, the answer is yes, and you've been paying a tax on optionality you weren't actually using.
