Your Agents Are Not Software Yet. Here Is How to Fix That.
Most organisations building AI agents in Copilot Studio are treating them like chatbot experiments. They click through the web interface, test by hand, deploy by exporting a solution file, and hope nothing breaks when someone else edits the same topic at the same time. This is not a development workflow. This is a prayer.
Microsoft quietly made a significant move in January 2026 that most IT leaders missed entirely. The Copilot Studio extension for Visual Studio Code went generally available. On the surface, it sounds like a developer convenience. In practice, it represents a fundamental shift in how enterprises should be building, governing, and deploying AI agents.
If you are building agents without source control, pull requests, and deployment pipelines, you are accumulating technical debt at a rate that will become very expensive to unwind.
The Problem: Agent Sprawl Without Engineering Discipline
Here is what I see across almost every organisation that has started building agents in the last twelve months. Someone in the business creates a Copilot Studio agent. It works well enough. Word spreads. Three more people start building agents. They are all working in the same environment, editing topics directly in the browser, with no version history beyond whatever the platform happens to track.
Now multiply that by ten teams, across three departments, in an organisation that already has governance frameworks for its codebase but somehow decided that AI agents exist outside those frameworks.
The result is predictable. Conflicting changes. No audit trail worth examining. No way to roll back to a known good state. No code reviews. No testing gates. No deployment pipeline. Agents that work in development but fail in production because someone changed a connector setting and nobody noticed.
This is not a hypothetical. I have seen it happen in organisations that are otherwise rigorous about their software development lifecycle. The problem is cultural as much as it is technical. Because agents are built in a low-code interface, they are not treated as software. But they are software. They make decisions, call APIs, access business data, and interact with customers. If that does not warrant engineering discipline, I do not know what does.
The Reality: Low-Code Does Not Mean Low-Rigour
There is a persistent misconception that low-code platforms are somehow exempt from proper development practices. The thinking goes something like this: if business users are building it, we cannot impose developer workflows on them. This is wrong in a way that becomes obvious the moment an agent does something unexpected in production and nobody can explain what changed or when.
The real tension is not between low-code and pro-code. It is between velocity and governance. Organisations want to move fast with agent development, which is entirely reasonable. But speed without structure is not agility. It is chaos with a deadline.
What most teams actually need is a way to keep the accessibility of the visual builder whilst adding the guardrails that prevent agents from becoming unmanageable. That is precisely what the Copilot Studio VS Code extension was built to address.
The Solution: Treating Agents as First-Class Software Artefacts
The Copilot Studio extension for VS Code brings agent development into the same workflow that engineering teams already use for application code. Here is what that looks like in practice, broken into three stages that any organisation can adopt incrementally.
Stage 1: Clone and Own Your Agent Definitions
The extension lets you pull the complete agent definition from Copilot Studio into a local workspace. Topics, tools, triggers, knowledge references, settings. Everything that defines what your agent is and how it behaves, represented as structured files on your machine.
This is the foundation. Once the agent definition lives in a folder rather than exclusively in the cloud, you can do things that were previously impossible or impractical. You can search across all topics at once. You can use find-and-replace. You can open multiple files side by side. You can work offline. These sound like small things until you are managing an agent with forty topics and need to find every reference to a specific entity.
The extension provides syntax highlighting and IntelliSense-style completion for the agent definition format, which means fewer errors during editing and faster navigation through complex agent configurations.
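To make the idea of "structured files on your machine" concrete, here is a rough sketch of what a single topic file in a cloned workspace might look like. The actual schema is defined by Copilot Studio and may differ; the field names and trigger phrases below are indicative only.

```yaml
# Illustrative sketch of one topic in a cloned agent workspace.
# The real schema is Copilot Studio's; treat these fields as indicative, not authoritative.
kind: AdaptiveDialog
beginDialog:
  kind: OnRecognizedIntent
  id: main
  intent:
    triggerQueries:
      - "order status"
      - "where is my order"
  actions:
    - kind: SendActivity
      id: confirmLookup
      activity: "Let me check that order for you."
```

Because topics are plain text like this, a workspace-wide search for an entity name or connector reference becomes a one-keystroke operation rather than forty clicks through a browser.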
Stage 2: Git Workflows for Agent Governance
This is where the real value emerges. Once your agent definitions are local files, they can live in a Git repository. That single change unlocks everything that modern software teams take for granted.
Version history. Every change to every topic, tool, and trigger is tracked with full context. Who changed it, when, and why. No more guessing what happened between Tuesday and Thursday when the agent started behaving differently.
Pull requests. Changes go through review before they reach the agent. A second pair of eyes catches the topic that accidentally removed a guardrail or the tool configuration that points at the wrong endpoint. This is not bureaucracy. This is how you prevent production incidents.
Branching. Multiple people can work on the same agent simultaneously without overwriting each other. Feature branches for new capabilities. Hotfix branches for urgent corrections. The merge workflow that developers have been using for two decades, now applied to agent development.
Conflict resolution. The extension includes a diff view that compares your local changes against the cloud state. Before you apply updates, you can see exactly what will change and resolve any conflicts deliberately rather than discovering them after the fact.
Stage 3: Pipeline-Driven Deployment
With agent definitions in source control, you can integrate them into your existing CI/CD pipelines. Azure DevOps, GitHub Actions, whatever your organisation already uses. The agent moves through environments with the same rigour as your application code: development to staging to production, with approval gates at each transition.
This is the missing piece that makes agent governance practical at scale. It is not about adding process for the sake of process. It is about ensuring that the agent running in production is the one you tested, reviewed, and approved. Not the one someone edited directly in the browser ten minutes ago.
The AI-Assisted Inner Loop
Here is where this connects to the broader shift in how software gets built. Because the agent definition is now a set of structured files in VS Code, you can use GitHub Copilot, Claude, or any AI coding assistant to help you write and refine agent components.
Need a new topic that handles a specific customer inquiry? Describe what you want in natural language and let the AI assistant draft it. Need to update tool configurations across multiple topics? Let the assistant handle the repetitive changes whilst you focus on the logic that matters.
This is vibe coding applied to agent development. You describe the intent, the AI generates the structure, you review and refine, then sync back to Copilot Studio to test. The inner loop gets faster without sacrificing the governance that the outer loop provides.
It is a compelling model because it addresses both sides of the velocity-governance tension. Makers and developers work faster because AI helps with the mechanical parts. The organisation maintains control because every change still flows through source control and review.
What This Means for Evaluating Agent Quality
There is a related development worth understanding. Microsoft recently released the Evals for Agent Interop starter kit, a framework for evaluating agents against realistic business scenarios. Think of it as a testing harness specifically designed for the way agents interact with Microsoft 365 surfaces like email, calendar, and documents.
The starter kit provides curated scenarios and representative data, along with rubrics that measure quality, efficiency, and robustness. Organisations can run their agents through these scenarios to benchmark performance, compare different implementations, and verify improvements before production deployment.
When you combine this with the VS Code extension workflow, you get something close to a proper software development lifecycle for agents. Build locally with AI assistance. Review through pull requests. Test with standardised evaluations. Deploy through pipelines. Monitor and iterate.
This is how agent development matures from experimentation to engineering.
The Impact: What Changes When You Do This Properly
Organisations that adopt this workflow consistently report three measurable improvements.
Reduced production incidents. Pull request reviews catch configuration errors before they reach users. Version history means you can always roll back. The combination eliminates the most common source of agent failures: unreviewed changes deployed directly to production.
Faster development cycles. This sounds counterintuitive. Adding governance should slow things down, right? In practice, the opposite happens. Teams spend less time debugging mystery changes and resolving conflicts manually. The AI-assisted editing loop accelerates the creative work. The net effect is more agent capabilities shipped per sprint, not fewer.
Audit readiness. For regulated industries, this is not optional. Having a complete, immutable history of every change to every agent, with approvals and review comments attached, transforms compliance from a retrospective exercise into a continuous state. When the auditor asks what changed and why, you have a definitive answer.
Getting Started This Week
You do not need to overhaul your entire agent development process overnight. Start with one agent that matters.
First, install the Copilot Studio extension from the Visual Studio Marketplace. Clone your most important agent to a local workspace. Commit it to a Git repository. That gives you version history immediately.
Second, establish a simple branching convention. Main branch reflects production. Feature branches for new work. Pull requests required before merging to main. You can enforce this with repository policies in Azure DevOps or GitHub.
Third, connect the repository to your deployment pipeline. Even a basic pipeline that syncs the agent definition from the main branch to your production environment is a massive improvement over manual deployment.
Each of these steps takes less than a day to implement. The compound effect over weeks and months is substantial.
Where I Think This Is Heading
The Copilot Studio VS Code extension is not just a productivity tool. It signals where Microsoft sees agent development going. Agents are converging with software engineering. The tools, practices, and governance frameworks that we built for application development over the past three decades are being adapted for agent development.
Organisations that recognise this early and invest in proper agent engineering practices now will have a significant advantage. Not because their agents will be more sophisticated, but because they will be more reliable, more auditable, and more maintainable.
The ones that treat agents as disposable experiments, built in a browser and deployed on a prayer, will discover that technical debt in agent development compounds just as fast as it does everywhere else.
If your team is building agents and you want to talk through what a proper engineering workflow looks like for your organisation, I am always happy to have that conversation. Reach out on LinkedIn or drop me a message through Cloud Direct.