The deferred cost of cloud AI lock-in

The decision to build an AI stack on a single closed-vendor platform looks rational on day one. The integration is smooth, the pricing is competitive, the documentation is clear, and the time-to-first-useful-output is measured in days rather than weeks. The decision continues to look rational at month six and at month twelve. The reckoning arrives somewhere between month twenty-four and month thirty-six, when the team starts to notice that the asset they have been building is not theirs in any meaningful sense.

This piece is the long-form version of a conversation I have been having with founders and CTOs more or less weekly. In my experience, leaving migration cost out of the Day 1 decision inputs is where most teams quietly lock themselves into a future they will spend a year and a budget cycle escaping. The right time to plan portability is at month one, not month thirty-six.

What actually accumulates inside a closed AI vendor

The simple model of cloud AI dependency is the inference layer — your application calls the vendor's model API, the vendor returns tokens, you pay per token, and switching vendors is a matter of swapping the API endpoint. This model is roughly accurate for the first six months of any deployment and roughly fictional thereafter.

What actually accumulates inside a serious AI deployment, in addition to inference traffic, is a set of assets that gradually become load-bearing for the application:

The embedding cache. Every document the system has indexed has been embedded by the vendor's model and stored in the vendor's vector store. The embeddings are dimensioned to that specific model and are not directly portable to another.
The prompt scaffolding. Hundreds of prompts, refined over months against the vendor's specific model behaviour, with idioms that work on this model and break on others.
The tool descriptions. Function definitions tuned to the calling conventions of this vendor's tool-use API, with field names and schemas that map to the vendor's preferences.
The knowledge graph. Entities, relationships, and metadata accumulated across the corpus, often stored in a structure tied to the vendor's retrieval format.
The conversation memory. User-level conversation state, often stored in vendor-specific session structures that do not export cleanly.
The evaluation harness. Test cases and golden outputs benchmarked against the vendor's model, which means the harness measures fit-to-this-vendor rather than fit-to-task.

Each of these is an asset on day one and a liability on day one thousand. The vendor lock-in is not the inference traffic; it is the asset accumulation. A team that does not understand this has not yet priced its own migration.
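
The embedding cache is the easiest of these to make concrete. A minimal sketch, assuming a hypothetical `StoredEmbedding` record and made-up model names (none of this is a real vendor API): storing the producing model next to each vector shows how much of an index survives a model switch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredEmbedding:
    """A vector plus the provenance needed to know whether it can be reused."""
    doc_id: str
    model: str  # hypothetical model identifier
    vector: tuple[float, ...]

    @property
    def dim(self) -> int:
        return len(self.vector)


def is_reusable(record: StoredEmbedding, query_model: str, query_dim: int) -> bool:
    """Vectors are only comparable when the same model produced both sides.

    Matching dimensions alone is not enough: two different models can emit
    vectors of the same length that live in unrelated spaces.
    """
    return record.model == query_model and record.dim == query_dim


def docs_to_reembed(index: list[StoredEmbedding], new_model: str, new_dim: int) -> list[str]:
    """Return the ids of every document that must be re-embedded after a switch."""
    return [r.doc_id for r in index if not is_reusable(r, new_model, new_dim)]


index = [
    StoredEmbedding("doc-1", "vendor-embed-v1", (0.1, 0.2, 0.3)),
    StoredEmbedding("doc-2", "vendor-embed-v1", (0.4, 0.5, 0.6)),
]
# Switching embedding models invalidates the whole index, not part of it.
stale = docs_to_reembed(index, "other-embed-v2", 4)
```

In this toy index every document comes back as stale, which is the point: the re-embedding bill scales with the size of the corpus, not with the size of the code change.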

The migration cost curve

The shape of the migration cost over time is the single most important diagram nobody draws when adopting a closed AI vendor. At month one, migration cost is essentially the engineering time to swap an API endpoint and update some configuration — a sprint, maybe two. At month six, the team has accumulated some prompt scaffolding and a small knowledge index, and migration is a meaningful project — probably a quarter. By month twenty-four, migration is a major engineering initiative that competes with feature development for an entire planning cycle, and by month thirty-six it is a strategic decision that requires executive sponsorship and a multi-quarter plan.

The curve is not linear. It compounds. The reason is that each new asset accumulates dependencies on every other asset, and the dependency graph grows faster than the asset count. By month thirty-six, the migration is not really a code change; it is a re-architecture, with all the planning, testing, and cutover work that implies.
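
A rough way to see the compounding is to count pairs. If every asset can depend on every other asset, the number of potential dependencies grows as n(n-1)/2 while the asset count grows as n. The sketch below is an illustrative worst-case model under that assumption, not a measurement of any real system.

```python
def potential_dependencies(asset_count: int) -> int:
    """Upper bound on pairwise dependencies among `asset_count` assets."""
    return asset_count * (asset_count - 1) // 2


# Asset count versus the number of pairs that may need untangling.
for n in (1, 2, 4, 6):
    print(n, potential_dependencies(n))
```

Under this model the six asset types listed earlier allow up to 15 pairwise couplings, against a single inference endpoint with none.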

The dependency types ranked by migration friction

Not every dependency is equally painful to migrate. The ranking that has held up across migrations I have either run or advised on:

Dependency	Day 1 cost to swap	Day 1000 cost to swap	Multiplier
Inference endpoint	1 sprint	2 sprints	2x
Tool descriptions / function calls	1 sprint	1 quarter	~6x
Prompt scaffolding	0.5 sprints	2 quarters	~24x
Embedding cache + vector store	1 sprint	3 quarters	~36x
Knowledge graph structure	2 sprints	1 year+	~50x
Evaluation harness	0.5 sprints	2 quarters	~24x
Conversation memory format	1 sprint	3 quarters	~36x

The interesting result is that the inference endpoint, the dependency most teams treat as the whole of the lock-in, stays cheap throughout: one sprint to swap on day one and two sprints on day one thousand. The expensive things are the assets that accumulate model-specific decisions: the prompts that were tuned to this model, the embeddings that were dimensioned to this model, the schema choices that match this vendor's preferences. These are the dependencies that compound, and they are exactly the ones the day-one decision tends to ignore.

Why portability has to be designed in at month one

The architectural disciplines that preserve portability are not exotic. They are also, almost universally, the ones teams skip when they are moving fast and shipping features. The pattern that survives a migration well shares a few invariants:

The inference layer is wrapped. The application talks to a thin abstraction, not directly to the vendor's SDK. The wrapper handles request shaping, retry logic, and observability uniformly across whichever underlying model is serving the call.
The embedding model is owned. Even if the inference layer is rented, the embedding pipeline runs on a model the team controls. This decouples the index from any single inference vendor and makes inference portable without touching the corpus.
The vector store is portable. The retrieval layer uses a vector database that can run on infrastructure the team controls, with the embeddings the team's own pipeline produced. Migration of the inference layer leaves the index intact.
Prompts are model-agnostic where possible. The prompt scaffolding is structured around behaviours and constraints that work across multiple model families, with model-specific refinements isolated to a small adapter layer.
The knowledge graph is in a neutral format. Entities and relationships live in a representation that is not tied to any single vendor's retrieval API.
The evaluation harness measures task fit, not model fit. Test cases evaluate whether the output meets the requirements of the task, not whether it matches the previous output of the previous model.

None of these adds meaningful day-one engineering cost when designed in from the start. All of them are extremely expensive to retrofit. The teams that build them on day one have a stack that survives any single vendor's pricing change, terms revision, or strategic pivot. The teams that skip them have a stack that gets quietly more expensive to leave every month, until leaving is no longer practical inside a normal planning horizon.
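
The wrapped inference layer is the smallest of these to sketch. The example below is one possible shape, not a prescribed design: `ChatBackend`, `EchoBackend`, and `InferenceClient` are hypothetical names, and the echo backend stands in for whatever vendor SDK call a real adapter would make.

```python
import time
from typing import Callable, Optional, Protocol


class ChatBackend(Protocol):
    """Hypothetical uniform interface the application codes against."""
    def complete(self, prompt: str) -> str: ...


class EchoBackend:
    """Stand-in for a vendor SDK; a real backend would wrap the vendor client here."""
    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


class InferenceClient:
    """Thin wrapper: retries and logging live here, not in application code."""

    def __init__(self, backend: ChatBackend, max_attempts: int = 3,
                 backoff_seconds: float = 0.0,
                 log: Callable[[str], None] = print) -> None:
        self._backend = backend
        self._max_attempts = max_attempts
        self._backoff_seconds = backoff_seconds
        self._log = log

    def complete(self, prompt: str) -> str:
        last_error: Optional[Exception] = None
        backend_name = type(self._backend).__name__
        for attempt in range(1, self._max_attempts + 1):
            try:
                result = self._backend.complete(prompt)
                self._log(f"attempt={attempt} ok backend={backend_name}")
                return result
            except Exception as exc:  # a real wrapper would catch narrower error types
                last_error = exc
                self._log(f"attempt={attempt} failed backend={backend_name}: {exc}")
                time.sleep(self._backoff_seconds * attempt)  # linear backoff
        raise RuntimeError("all attempts failed") from last_error


client = InferenceClient(EchoBackend())
reply = client.complete("hello")
```

Swapping vendors then means writing one new class with a `complete` method and changing the constructor argument; the application code that calls `client.complete` does not move.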

The terms-change risk that nobody prices

The simple model of vendor risk is pricing — the vendor raises prices, the team's margins shrink, eventually a switching decision becomes economic. The actual model is broader and considerably more uncomfortable. Vendors change three things over a thirty-six-month horizon: pricing, data terms, and capability access. Any one of them can force a migration on a timeline the team did not control.

Pricing changes are the most visible. The historic norm in cloud AI has been pricing that falls over time as compute economics improve, but more recent releases have gone the other way: premium tiers introduced with capability gating, free tiers narrowed, generous early-access terms revoked. The direction of travel is towards more pricing complexity, not less.

Data terms changes are the most consequential. A vendor who shipped under a no-training data clause and then revises it has effectively reclassified the team's corpus. A vendor whose retention window quietly extends turns historical conversations into a long-tail liability. A vendor acquired by another company inherits whatever data position the acquirer prefers. None of these are unusual events. All of them have happened to closed AI vendors in the last two years.

Capability access changes are the most operationally disruptive. A model the team has tuned around gets deprecated. A feature the team depends on becomes premium-tier-only. A region the team serves loses local availability. The team has no leverage in any of these conversations because the dependency is one-way.

What "plan portability at month one" actually looks like

The Day 1 portability checklist that costs almost nothing to implement and saves a year of pain at migration time:

Pick an embedding model the team can run on its own infrastructure, even if the team is renting inference capacity for now. The embedding decision locks the corpus; make it once, deliberately, against a model with portability properties.
Run the vector store on infrastructure the team controls. Hosted vector databases are convenient on day one and an immovable dependency by day one thousand.
Wrap every inference call in a thin adapter that exposes a uniform interface. The adapter is where model-specific quirks live; the application calls the adapter.
Treat prompts as code. Version them, test them, evaluate them on a model-agnostic harness. The prompts that survive a migration well are the ones that were never tightly coupled to a single model's behaviour.
Store knowledge-graph entities and relationships in a neutral format — JSON, RDF, or a database schema the team owns — not in a vendor-specific retrieval structure.
Maintain an evaluation harness that runs against multiple models. The day a vendor's pricing changes, having three months of comparative data on alternative models is the difference between a measured migration and a panicked one.
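
The last item on the checklist can start very small. A minimal sketch, assuming a model is any callable from prompt to text; the two lambdas and the test cases are hypothetical stand-ins, and real checks would be richer than substring matching.

```python
from typing import Callable

# A "model" here is any callable from prompt to text; real backends would sit
# behind the same signature.
Model = Callable[[str], str]


def contains_all(required: list[str]) -> Callable[[str], bool]:
    """Task-level check: pass if the output mentions every required term."""
    def check(output: str) -> bool:
        lowered = output.lower()
        return all(term.lower() in lowered for term in required)
    return check


CASES = [
    # (prompt, check) pairs describe the task, not any model's previous output.
    ("Summarise the refund policy.", contains_all(["refund", "30 days"])),
    ("Name the capital of France.", contains_all(["paris"])),
]


def pass_rates(models: dict[str, Model]) -> dict[str, float]:
    """Run every case against every model and report the fraction passed."""
    return {
        name: sum(check(model(prompt)) for prompt, check in CASES) / len(CASES)
        for name, model in models.items()
    }


models = {
    "model-a": lambda p: "Refunds are accepted within 30 days. Paris.",
    "model-b": lambda p: "Paris is the capital of France.",
}
rates = pass_rates(models)
```

Because the checks describe what the task requires rather than what one model used to say, the same cases can be run on a schedule against every candidate backend and the results compared directly.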

This is not heroic engineering. It is the small set of decisions that, made deliberately at month one, preserve the team's optionality at month thirty-six. The teams that have done this end up with a stack where any single vendor can be swapped within a quarter. The teams that have not done so end up with a stack where the vendor's strategy is, in effect, also their strategy.
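
"Treat prompts as code" can be as simple as a versioned template whose model-specific refinements sit in one place. The sketch below assumes a hypothetical `PromptTemplate` class and invented family names; it illustrates the idea of an isolated adapter layer and is not a recommendation of any particular library.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PromptTemplate:
    """A versioned prompt whose core text is shared across model families."""
    name: str
    version: str
    base: str
    # Model-specific refinements are isolated here; keys are hypothetical names.
    overrides: dict[str, str] = field(default_factory=dict)

    def render(self, model_family: str, **variables: object) -> str:
        """Fill in the shared text, then append any refinement for this family."""
        text = self.base.format(**variables)
        suffix = self.overrides.get(model_family, "")
        return f"{text}\n{suffix}" if suffix else text


SUMMARISE = PromptTemplate(
    name="summarise",
    version="1.2.0",
    base="Summarise the following text in at most {max_words} words:\n{text}",
    overrides={"family-x": "Respond with plain text only."},
)

generic = SUMMARISE.render("family-y", max_words=20, text="abc")
tuned = SUMMARISE.render("family-x", max_words=20, text="abc")
```

At migration time the `base` text carries over unchanged, and the work is confined to reviewing the entries in `overrides`.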

When closed-vendor dependency is actually fine

The argument for portability is not an argument against closed AI vendors. It is an argument for designing the boundary deliberately. There are three categories where leaning hard on a closed vendor is the rational choice:

Frontier reasoning on tasks where capability beats portability. If the work genuinely requires a frontier closed model and no open-weights alternative is close, use the closed model. The dependency is real but the alternative is not building the product at all.
Time-bounded experiments where the team is testing whether AI fit is real before investing in infrastructure. Closed-vendor convenience accelerates the experiment; the embedded asset accumulation is small while the experiment is short.
Workloads that are intrinsically not strategic — internal tooling, throwaway analyses, work where the corpus and prompts have no compounding value. The asset-accumulation argument does not apply when there are no assets.

Outside these three, the architectural default should be portability-first, with closed-vendor leverage as a deliberate choice on specific lanes rather than a structural commitment across the whole stack.

The deferred cost of cloud AI lock-in is not a marketing concern; it is a structural feature of how AI assets accumulate against a vendor over time. The migration cost curve compounds because the asset graph compounds, and the team that has not designed for portability at month one is the team that will discover, somewhere around month twenty-four, that their strategy is now downstream of someone else's pricing committee.

The right move is to design the boundary deliberately at the start. Wrap the inference layer; own the embedding model; control the vector store; treat prompts as portable code; keep the knowledge graph in a neutral format. None of this is expensive on day one. All of it is enormously expensive to retrofit. The teams winning the long game in AI in 2026 are the ones that priced this on day one and built accordingly. The teams that did not are running the same migration project everyone else is running, three years late and one budget cycle short.
