Microsoft Sovereign Cloud Now Runs AI Models Completely Offline — And That Changes Everything
The Challenge
Here's a tension that comes up in almost every conversation I have with public sector and regulated industry IT leaders: they want AI, but they can't send data to the cloud. Not won't — can't. Regulatory requirements, classification levels, or operational risk profiles simply don't allow it.
Until now, the answer was usually "wait" or "build something bespoke." Sovereign cloud offerings gave you infrastructure control, but AI capabilities were still tethered to connectivity. You could run your Exchange Server locally, but if you wanted large language model inference, you needed a network path to Azure OpenAI. For defence, intelligence, critical national infrastructure, and heavily regulated financial services, that was a non-starter.
The result? These organisations — the ones with arguably the most to gain from AI — were the last to get access. And the gap was widening with every new model release.
What's Changed
Microsoft's latest Sovereign Cloud update tackles this directly with three capabilities that, together, form what they're calling Sovereign Private Cloud — a full-stack experience spanning infrastructure, productivity, and AI that runs entirely within the customer's boundary.
Azure Local Disconnected Operations (GA now)
Azure Local already let organisations run Azure-consistent infrastructure on-premises. The disconnected mode takes that further: management, policy enforcement, and workload execution all stay local with zero cloud connectivity required. You get the same Azure governance model — RBAC, policy, compliance — without ever phoning home.
This isn't just about intermittent connectivity. It's designed for environments where external dependencies are unacceptable by policy. Think classified networks, submarine operations, remote field deployments, or air-gapped financial systems.
Microsoft 365 Local Disconnected (GA now)
Exchange Server, SharePoint Server, and Skype for Business Server can now run inside the customer's sovereign boundary on Azure Local. Microsoft is committing support through at least 2035.
Yes, Skype for Business in 2026. Before you raise an eyebrow — in disconnected sovereign environments, you work with what's available and supported. These aren't organisations choosing between Teams and Skype. They're choosing between "productivity tools that work offline" and "nothing."
Foundry Local with Large AI Models
This is the headline. Foundry Local now supports large multimodal models running on customer hardware inside sovereign boundaries. Using NVIDIA GPU infrastructure, organisations can run inference locally — no external API calls, no data leaving the boundary, no connectivity required.
Previously, Foundry Local was limited to smaller models. This expansion means you can run the same class of models available in Azure AI Foundry, but entirely on-premises with local APIs. Microsoft provides support for deployment, updates, and operational health.
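To make "local APIs" concrete: Foundry Local serves models through an OpenAI-compatible REST endpoint on the local machine. The sketch below is illustrative only; the port number, endpoint path, and model alias are assumptions for this example, not confirmed values, so check the Foundry Local documentation for your deployment. The point it demonstrates is architectural: the request targets loopback, so inference traffic never leaves the sovereign boundary.

```python
import json
from urllib.request import Request

# Assumption: Foundry Local exposes an OpenAI-compatible chat API on
# localhost. The port (5273) and path here are illustrative placeholders.
LOCAL_ENDPOINT = "http://localhost:5273/v1/chat/completions"

def build_local_inference_request(model: str, prompt: str) -> Request:
    """Build a chat-completion request against the local endpoint.

    Because the target is a loopback address, no data crosses the
    network boundary -- the defining property of disconnected inference.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return Request(
        LOCAL_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# "phi-4" is an illustrative model alias, not a guaranteed catalogue name.
req = build_local_inference_request("phi-4", "Summarise this incident report.")
# Nothing is sent until urlopen(req) is called, and even then the
# traffic stays on the local host.
```

The same request shape works against Azure OpenAI in connected environments, which is the practical upside: application code written for the cloud can be pointed at the local endpoint with a base-URL change rather than a rewrite.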
Getting Started
If your organisation operates in sovereign, classified, or regulated environments, here's the practical path:
- Assess your connectivity posture. Map which workloads genuinely require disconnected operation versus those that could operate in a hybrid or intermittently connected mode. Not everything needs to be air-gapped.
- Evaluate Azure Local. If you're not already running Azure Local (formerly Azure Stack HCI), start there. It's the infrastructure foundation for everything else. The disconnected operations documentation covers the deployment model.
- Scope your AI use cases. Foundry Local with large models is available to qualified customers. Work with your Microsoft account team to understand the hardware requirements (NVIDIA GPU infrastructure) and model availability for your classification level.
- Plan the productivity layer. If your users need email, document collaboration, and communications in disconnected environments, Microsoft 365 Local is the path. The 2035 support commitment gives you a long runway for planning.
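The first step, triaging workloads by connectivity requirement, is ultimately a policy exercise, but it can be sketched as a simple rule set. Everything below is illustrative: the workload names, classification labels, and triage rules are made up for this example, and your real criteria will come from your regulator and security policy, not from code.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    classification: str        # illustrative labels, e.g. "official", "secret"
    external_deps_allowed: bool

def connectivity_posture(w: Workload) -> str:
    """Illustrative triage: does this workload need fully disconnected
    operation, or is hybrid / intermittent connectivity acceptable?"""
    if w.classification == "secret" or not w.external_deps_allowed:
        return "disconnected"   # candidate for Azure Local disconnected ops
    return "hybrid"             # cloud-connected management is acceptable

# Made-up example inventory
inventory = [
    Workload("mission-planning", "secret", False),
    Workload("hr-portal", "official", True),
]
postures = {w.name: connectivity_posture(w) for w in inventory}
```

The value of writing the rules down, even this crudely, is that it forces the "not everything needs to be air-gapped" conversation early, before hardware is provisioned for workloads that never needed it.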
One important caveat: "qualified customers" for Foundry Local large models means this isn't self-service. Expect an engagement process with Microsoft to validate your use case, infrastructure, and security posture.
What This Means
The strategic shift here is significant. Microsoft is saying: sovereignty and AI capability are not mutually exclusive. You don't have to choose between running models and maintaining control.
For UK public sector and defence — where the conversation about sovereign AI has been especially active — this removes one of the biggest blockers. The question moves from "can we use AI at all?" to "which models and use cases make sense for our operational boundary?"
There's a broader pattern too. Between Azure Local, Microsoft 365 Local, and Foundry Local, Microsoft is building a complete disconnected stack that mirrors the cloud experience. That's a bet that the market for fully sovereign, fully disconnected AI is large enough to justify the engineering investment. Given the geopolitical direction of data sovereignty regulation globally, that bet looks sound.
The honest constraint: running large models on local hardware is expensive. You're provisioning NVIDIA GPUs in your own datacentre, not consuming them as a managed service. The economics of sovereign AI are fundamentally different from cloud AI, and organisations need to budget accordingly.
But for the customers who need this — and there are more of them than most people realise — cost is not the primary concern. Control is. And that's exactly what this delivers.
Leon Godwin, Principal Cloud Evangelist at Cloud Direct