Correct Code Is No Longer Enough: Why AI Agents Need a New SDLC

Traditional software development was built for deterministic systems. Agentic AI introduces runtime decisions, external actions, and new failure modes that require a different approach to architecture and governance.

Jun 01, 2026

Software engineering was built for a deterministic world.

Same input. Same logic. Same output.

That assumption shaped the way we designed systems, tested systems, deployed systems, and governed change. The traditional Software Development Lifecycle, or SDLC, was built around one central question:

Did we build the software correctly?

If the code was clean, the tests passed, the security controls were in place, and the deployment pipeline worked, we could reasonably assume the system would behave as designed.

That assumption worked when software followed explicit paths written by developers.

It starts to break when the software itself begins to participate in decisions.

AI agents do not behave like normal application features. They interpret context. They make bounded decisions under uncertainty. They use tools. They call APIs. They coordinate actions. They may trigger workflows across systems that sit outside the original application boundary.

That changes the architectural problem.

The question is no longer only:

Did we build the software correctly?

The new question is:

Did we design, constrain, verify, and operate the agent safely and effectively?

That is a very different discipline.

The Old SDLC Was Designed for Predictable Behaviour

Traditional software architecture assumes that system behaviour comes from code and configuration.

A developer writes the logic.
An architect defines the integration pattern.
A tester checks the expected outcomes.
A release pipeline moves the change into production.

This model works because the system is deterministic enough to be tested against known scenarios.

Of course, production systems still fail. APIs break. Databases slow down. Infrastructure goes down. Users do unexpected things. But the behaviour of the software is still mostly traceable to something we designed, coded, configured, or deployed.

AI agents introduce a different type of uncertainty.

An agent may receive a new context, interpret it in a way its designers did not anticipate, select a tool, call an external service, and generate a real operational effect.

The code may be correct.

The infrastructure may be available.

The integration may be working.

And the system may still do the wrong thing.

That is the uncomfortable part.

Correct code does not automatically mean controlled behaviour when the software is capable of interpreting, deciding, and acting.

AI Agents Are Not Just Features

One of the biggest mistakes organisations make is treating AI as a feature to be inserted into an existing delivery model.

A chatbot here.
An automation layer there.
A workflow assistant inside an existing platform.
A model connected to a few internal tools.

At small scale, this feels manageable. The agent is seen as a productivity enhancement, not a change to the architecture of control.

But an AI agent is not just another module.

Once an agent can use tools, access data, make decisions, or trigger actions, it becomes part of the operational fabric of the organisation. It is no longer only producing text. It is participating in business execution.

That means the architecture must change.

The model is only one component. Around it, the organisation needs policies, boundaries, telemetry, approval flows, exception handling, audit trails, and human escalation paths.

Without those controls, the organisation is not deploying intelligence.

It is deploying uncertainty.

The New Failure Modes Are Behavioural

Traditional engineering teams are trained to look for familiar failures.

Does the code compile?
Do the tests pass?
Is the API available?
Is the database healthy?
Is the deployment secure?
Is the service observable?

All of this still matters.

But agentic systems introduce failure modes that standard CI/CD pipelines were not designed to catch.

An agent may take an action it was not authorised to take.

It may misunderstand the meaning of a business instruction.

It may use the right tool for the wrong purpose.

It may act on incomplete evidence.

It may produce a decision that cannot be explained later.

It may complete a workflow successfully from a technical point of view, while still violating business policy, compliance requirements, or customer expectations.

These are not simple software bugs.

They are failures of control.

This is why testing an agent cannot stop at functional correctness. We also need to test intent, context, authority, evidence, and escalation.

The real question becomes:

Can we prove the agent remained inside its authorised operating boundary?
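
One way to make that question testable is to replay the agent's recorded actions against a declared boundary. The sketch below is illustrative only: the `Action` and `Boundary` types, the tool names, and the 100.0 limit are assumptions, not part of any particular agent framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    tool: str            # tool the agent invoked
    target: str          # system the call touched
    amount: float = 0.0  # monetary value involved, if any


@dataclass(frozen=True)
class Boundary:
    allowed_tools: frozenset
    allowed_targets: frozenset
    max_amount: float


def find_violations(trace, boundary):
    """Return one readable reason for every action that left the boundary."""
    violations = []
    for step, action in enumerate(trace):
        if action.tool not in boundary.allowed_tools:
            violations.append(f"step {step}: tool '{action.tool}' not authorised")
        if action.target not in boundary.allowed_targets:
            violations.append(f"step {step}: target '{action.target}' out of scope")
        if action.amount > boundary.max_amount:
            violations.append(
                f"step {step}: amount {action.amount} exceeds {boundary.max_amount}"
            )
    return violations


# Hypothetical boundary and trace for a refund agent.
boundary = Boundary(
    allowed_tools=frozenset({"lookup_order", "issue_refund"}),
    allowed_targets=frozenset({"orders_db"}),
    max_amount=100.0,
)
trace = [
    Action("lookup_order", "orders_db"),
    Action("issue_refund", "orders_db", amount=250.0),
]
violations = find_violations(trace, boundary)
print(violations)
```

Both calls in this trace would succeed technically. The check still flags the second one, which is the kind of failure a functional test suite would not see.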

Agentic Architecture Requires Runtime Control

Traditional architecture relies heavily on design-time structure.

We define the components.
We define the interfaces.
We define the data flows.
We define the security controls.
We define the deployment model.

Agentic architecture needs all of that, but it also needs active runtime discipline.

The reason is simple: agentic behaviour emerges at runtime.

It emerges from model outputs, live context, policy data, tool responses, user instructions, environmental signals, and previous state.

That means control cannot live only in a design document.

It must be enforced while the agent is operating.

Policies need to be machine-readable.
Boundaries need to be executable.
Approvals need to be built into the workflow.
Telemetry needs to capture not only what happened, but why the agent acted.
Escalation needs to be part of the operating model, not an afterthought.
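
As a minimal sketch of what "executable boundaries" could look like, the wrapper below sits between the agent and its tools. It checks every call at runtime and records the agent's stated reason alongside the outcome. `GuardedTools`, `BoundaryViolation`, and the two example tools are hypothetical names chosen for illustration.

```python
class BoundaryViolation(Exception):
    """Raised when the agent requests a tool outside its boundary."""


class GuardedTools:
    def __init__(self, tools, allowed, log):
        self._tools = tools           # tool name -> callable
        self._allowed = set(allowed)  # names this agent may call
        self._log = log               # append-only list of call records

    def call(self, name, reason, **kwargs):
        permitted = name in self._allowed and name in self._tools
        # Record what was requested and why, whether or not it is permitted.
        self._log.append(
            {"tool": name, "reason": reason, "args": kwargs, "permitted": permitted}
        )
        if not permitted:
            raise BoundaryViolation(f"tool '{name}' is outside the agent's boundary")
        return self._tools[name](**kwargs)


# Hypothetical tools: the agent may read tickets but not close accounts.
tools = {
    "read_ticket": lambda ticket_id: f"ticket {ticket_id}",
    "close_account": lambda account_id: "closed",
}
audit_log = []
guarded = GuardedTools(tools, allowed={"read_ticket"}, log=audit_log)

result = guarded.call("read_ticket", reason="user asked for status", ticket_id=42)
print(result)
```

The design choice here is that the check lives in the platform, not in the prompt. The model can ask for anything; only the wrapper decides what actually runs.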

This is where many AI initiatives fail.

They focus on model capability before they design operational control.

They ask, “What can the agent do?”

Architects need to ask a more important question:

What is the agent allowed to do, under what conditions, with what evidence, and with what human oversight?

The Six Design Pillars of a Governable Agent

Before building an agentic system, architects need to define the control model.

I would frame this around six design pillars.

1. Intent Design

Intent design defines the purpose of the agent.

Not in vague terms like “improve productivity” or “automate support”.

It must define the agent’s specific goal, the business outcome it supports, and the negative space around that goal.

Negative space matters because it clarifies what the agent must not do.

A well-designed agent needs a narrow and explicit purpose. The broader the intent, the harder the system is to govern.
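
A rough sketch of how an intent could be written down, assuming a simple record with an explicit negative space. The `IntentSpec` type and the task names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntentSpec:
    goal: str
    business_outcome: str
    in_scope: frozenset      # task types the agent exists to handle
    out_of_scope: frozenset  # negative space: tasks it must never take on

    def classify(self, task_type):
        if task_type in self.out_of_scope:
            return "forbidden"
        if task_type in self.in_scope:
            return "in_scope"
        # Anything nobody declared is treated as not permitted until reviewed.
        return "undefined"


# Hypothetical intent for a support agent.
refund_agent = IntentSpec(
    goal="Resolve refund requests for delivered orders",
    business_outcome="Shorter refund resolution time",
    in_scope=frozenset({"refund_request", "refund_status"}),
    out_of_scope=frozenset({"price_negotiation", "legal_complaint"}),
)

print(refund_agent.classify("refund_request"))
print(refund_agent.classify("legal_complaint"))
print(refund_agent.classify("address_change"))
```

The third case is the interesting one. A task that is neither in scope nor explicitly forbidden is a gap in the intent definition, and the sketch surfaces it as `undefined` instead of quietly allowing it.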

2. Boundary Design

Boundary design defines the agent’s authority.

What data can it access?
What tools can it use?
What systems can it touch?
What actions can it trigger?
What value threshold requires approval?
What risk level requires escalation?

This is where architecture becomes practical.

An agent that can read information is very different from an agent that can update records, send messages, move money, approve claims, change configurations, or communicate with customers.

Autonomy must be proportional to risk.
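
To illustrate proportional autonomy, here is a small sketch that maps an action and its value to a decision. The risk tiers, action names, and the 500.0 threshold are assumptions made up for the example; a real organisation would set its own.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    ESCALATE = "escalate"


# Hypothetical risk tiers: 0 = read-only, 1 = reversible change, 2 = high impact.
ACTION_RISK = {
    "read_record": 0,
    "send_message": 1,
    "update_record": 1,
    "move_money": 2,
}
APPROVAL_VALUE_THRESHOLD = 500.0  # assumed value limit


def authorise(action, value=0.0):
    risk = ACTION_RISK.get(action)
    if risk is None:
        return Decision.ESCALATE  # unknown actions never default to allow
    over_threshold = value > APPROVAL_VALUE_THRESHOLD
    if risk == 0:
        return Decision.ALLOW
    if risk == 1:
        return Decision.REQUIRE_APPROVAL if over_threshold else Decision.ALLOW
    # High-impact actions always involve a human; large ones go further up.
    return Decision.ESCALATE if over_threshold else Decision.REQUIRE_APPROVAL


print(authorise("read_record"))
print(authorise("move_money", value=50.0))
print(authorise("move_money", value=5000.0))
```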

3. Semantic Design

Semantic design defines shared meaning.

This is often ignored, but it is critical.

AI agents operate through language, context, and interpretation. If business terms are ambiguous, the agent may make technically valid but operationally wrong decisions.

What does “approved” mean?
What does “urgent” mean?
What does “high risk” mean?
What does “customer consent” mean?
What does “complete evidence” mean?

In traditional systems, ambiguity often hides inside human process.

In agentic systems, ambiguity becomes executable risk.
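
One possible way to reduce that risk is to pair each business term with an agreed, checkable definition. The sketch below assumes a small glossary; the 4-hour rule, the attachment names, and the `holds` helper are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable


class UndefinedTerm(Exception):
    """Raised when a term is used without an agreed definition."""


@dataclass(frozen=True)
class Term:
    name: str
    definition: str               # wording agreed with the business
    test: Callable[[dict], bool]  # the same definition, executable


# Hypothetical definitions; real ones would come from the business owners.
GLOSSARY = {
    "urgent": Term(
        "urgent",
        "The SLA deadline is less than 4 hours away",
        lambda case: case["hours_to_sla"] < 4,
    ),
    "complete evidence": Term(
        "complete evidence",
        "Both an invoice and a proof of delivery are attached",
        lambda case: {"invoice", "proof_of_delivery"} <= set(case["attachments"]),
    ),
}


def holds(term, case):
    if term not in GLOSSARY:
        raise UndefinedTerm(f"'{term}' has no agreed definition")
    return GLOSSARY[term].test(case)


case = {"hours_to_sla": 2, "attachments": ["invoice"]}
print(holds("urgent", case))
print(holds("complete evidence", case))
```

Asking whether the case is "high risk" raises `UndefinedTerm` here, which is the point: the agent should not be left to guess what an undefined term means.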

4. Policy Design

Policy design converts governance into machine-readable constraints.

This is where organisations need to move beyond PDF policies and static documents.

The agent needs runtime constraints it can follow and the platform can enforce.

A policy should define what is allowed, what is blocked, what requires approval, what must be logged, and what must be escalated.

In an agentic world, policy becomes part of the system architecture.

It should be versioned, tested, monitored, and reviewed with the same seriousness as code.
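
A minimal sketch of a policy expressed as data, assuming a simple first-match rule list with a blocking default. The schema, the action names, and the version string are hypothetical and not taken from any specific policy engine.

```python
import json

# Hypothetical policy document; in practice this would live in version control.
POLICY_JSON = """
{
  "version": "2026-06-01.1",
  "default": "block",
  "rules": [
    {"action": "read_claim", "effect": "allow", "log": false},
    {"action": "approve_claim", "max_value": 1000, "effect": "allow", "log": true},
    {"action": "approve_claim", "effect": "require_approval", "log": true},
    {"action": "close_account", "effect": "escalate", "log": true}
  ]
}
"""


def evaluate(policy, action, value=0):
    """Return the first matching rule's effect, or the blocking default."""
    for rule in policy["rules"]:
        if rule["action"] != action:
            continue
        if "max_value" in rule and value > rule["max_value"]:
            continue  # this rule only covers values up to its limit
        return {
            "effect": rule["effect"],
            "log": rule["log"],
            "policy_version": policy["version"],
        }
    # No rule matched: apply the default and always log it.
    return {"effect": policy["default"], "log": True, "policy_version": policy["version"]}


policy = json.loads(POLICY_JSON)
print(evaluate(policy, "approve_claim", value=200))
print(evaluate(policy, "approve_claim", value=5000))
print(evaluate(policy, "delete_everything"))
```

Every result carries the policy version that produced it, so a later review can tell which rules were in force when the agent acted.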

5. Protocol Design

Protocol design defines how the agent coordinates with systems, people, and other agents.

This includes approval flows, handoffs, retries, exception routing, timeout behaviour, and fallback paths.

Agents should not improvise critical coordination patterns.

If a decision requires human approval, the approval protocol must be explicit.

If an external system fails, the recovery protocol must be explicit.

If confidence is low, the escalation protocol must be explicit.

The agent should not be trusted to invent the operating model at runtime.
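
The sketch below shows one way to make those three protocols explicit in code: a confidence floor, a fixed retry limit, and a single escalation path. The `Protocol` settings, the 0.8 floor, and the helper names are assumptions. Backoff delays are left out to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protocol:
    max_attempts: int = 3        # assumed retry limit
    min_confidence: float = 0.8  # assumed confidence floor


def run_step(action, confidence, protocol, escalate):
    """Run one agent step under an explicit retry and escalation protocol."""
    if confidence < protocol.min_confidence:
        return escalate(f"confidence {confidence} below {protocol.min_confidence}")
    last_error = None
    for _ in range(protocol.max_attempts):
        try:
            return ("done", action())
        except ConnectionError as exc:  # treat as a transient external failure
            last_error = exc
    return escalate(f"gave up after {protocol.max_attempts} attempts: {last_error}")


def escalate_to_human(reason):
    # Placeholder: a real system would open a ticket or page an operator.
    return ("escalated", reason)


# Hypothetical external call that fails twice before succeeding.
calls = {"count": 0}


def flaky_lookup():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("down")
    return "ok"


outcome = run_step(flaky_lookup, confidence=0.95, protocol=Protocol(), escalate=escalate_to_human)
print(outcome)
```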

6. Assurance Design

Assurance design defines how we prove the system behaved correctly.

This includes telemetry, audit trails, evidence capture, decision provenance, simulation results, test outcomes, and post-incident review.

For normal software, observability often focuses on performance and availability.

For agentic systems, observability must also capture intent, context, decision path, policy evaluation, and tool usage.

Operators need to know more than whether the system was up.

They need to know whether the agent was under control.
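
As an illustration, a decision record could capture those fields directly. The sketch below also chains each record to the previous one with a SHA-256 hash so that later edits are detectable. The hash chain is my own assumption about how tamper evidence might be added, and the field values are invented.

```python
import hashlib
import json


def _digest(body):
    # Canonical JSON so the same content always produces the same hash.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_record(trail, *, intent, context, decision_path, policy_evaluation, tool_calls):
    body = {
        "intent": intent,
        "context": context,
        "decision_path": decision_path,
        "policy_evaluation": policy_evaluation,
        "tool_calls": tool_calls,
        "previous_hash": trail[-1]["hash"] if trail else "",
    }
    record = {**body, "hash": _digest(body)}
    trail.append(record)
    return record


def verify(trail):
    """Check that no record was altered and that the chain is unbroken."""
    previous = ""
    for record in trail:
        body = {key: value for key, value in record.items() if key != "hash"}
        if record["previous_hash"] != previous or record["hash"] != _digest(body):
            return False
        previous = record["hash"]
    return True


trail = []
append_record(
    trail,
    intent="resolve refund request",
    context={"order_id": "A-17", "order_value": 80},
    decision_path=["order delivered", "within refund window"],
    policy_evaluation={"effect": "allow", "policy_version": "v3"},
    tool_calls=["lookup_order", "issue_refund"],
)
print(verify(trail))
```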

The SDLC Must Expand

The traditional SDLC is not dead.

We still need requirements, design, build, test, deploy, and operate.

But for AI agents, those steps are no longer enough.

An Agentic SDLC needs additional stages.

It needs intent definition before solution design.
It needs boundary modelling before tool access.
It needs semantic validation before workflow automation.
It needs policy design before deployment.
It needs simulation before production release.
It needs runtime monitoring after go-live.
It needs exception handling as part of the core architecture.
It needs feedback loops that refine the agent’s operating boundaries over time.
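
The pre-release stages in that list can be sketched as a simple release gate. The gate names and the evidence dictionary are hypothetical. Runtime monitoring and feedback loops are left out because they apply after go-live.

```python
# Pre-release stages from the list above, in the order they are expected.
REQUIRED_GATES = [
    "intent_defined",
    "boundary_modelled",
    "semantics_validated",
    "policy_designed",
    "simulation_passed",
]


def release_decision(evidence):
    """Block the release unless every gate has recorded evidence."""
    missing = [gate for gate in REQUIRED_GATES if not evidence.get(gate)]
    return {"release": not missing, "missing": missing}


# Hypothetical evidence collected by the delivery pipeline.
evidence = {
    "intent_defined": True,
    "boundary_modelled": True,
    "semantics_validated": True,
    "policy_designed": True,
    "simulation_passed": False,
}
print(release_decision(evidence))
```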

This is the shift.

We are not only building applications anymore.

We are building governable systems of action.

That requires a different mindset from both engineering and architecture teams.

The Model Is Not the Architecture

A common mistake in AI discussions is to overfocus on the model.

Which model should we use?
Which vendor is better?
Which benchmark looks stronger?
Which tool has the best demo?

These questions matter, but they are not the architecture.

In production, the model is only the engine.

The real architecture is the system around the model.

The policies.
The data boundaries.
The orchestration layer.
The APIs.
The approval flows.
The monitoring.
The audit trail.
The incident process.
The human escalation path.

A powerful model inside a weak control environment is not an enterprise capability.

It is a risk multiplier.

The organisations that succeed with AI agents will not be the ones that simply adopt the most advanced models.

They will be the ones that design the most governable operating environments around those models.

Change Management Must Also Change

In traditional software delivery, change management focuses on code, infrastructure, configuration, and release artefacts.

In agentic systems, change management needs to expand.

We must version policies.
We must version prompts.
We must version semantic definitions.
We must version tool permissions.
We must version escalation rules.
We must version operating boundaries.

A small change to a policy can change what an agent is allowed to do.

A small change to a prompt can change how it interprets a situation.

A small change to tool access can change the operational risk profile of the entire system.
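
One way to treat these artefacts as a single versioned unit is to fingerprint them together, so that any change produces a new version that can be reviewed. The bundle layout and its contents below are assumptions made for the example.

```python
import hashlib
import json


def fingerprint(bundle):
    """Short content hash over every control artefact in the bundle."""
    canonical = json.dumps(bundle, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def changed_parts(old, new):
    """Name the artefacts that differ between two bundles."""
    return sorted(key for key in old.keys() | new.keys() if old.get(key) != new.get(key))


# Hypothetical control bundle for one agent.
v1 = {
    "prompt": "You handle refund requests for delivered orders.",
    "policy_version": "v3",
    "semantic_definitions": {"urgent": "SLA deadline under 4 hours"},
    "tool_permissions": ["lookup_order", "issue_refund"],
    "escalation_rules": {"min_confidence": 0.8},
}

# A "small" change: one extra tool.
v2 = {**v1, "tool_permissions": ["lookup_order", "issue_refund", "close_account"]}

print(fingerprint(v1) != fingerprint(v2))
print(changed_parts(v1, v2))
```

A one-line change to tool access gives the bundle a new fingerprint, which makes it a release in its own right and not a silent configuration tweak.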

This is why AI governance cannot sit outside delivery.

It must be embedded into the engineering lifecycle.

The Real Architecture Question

The future of software delivery is not just about faster coding.

That is the shallow conversation.

The deeper conversation is about controlled autonomy.

How do we design systems that can reason, act, and adapt without losing organisational control?

How do we allow agents to create value without allowing them to create unmanaged risk?

How do we move from experimentation to production without pretending that AI agents behave like normal software?

This is where architecture becomes essential.

Not as a documentation function.

Not as a governance bottleneck.

But as the discipline that creates clarity between business ambition, engineering capability, operational risk, and long-term control.

AI agents will force organisations to rediscover the real purpose of architecture.

Architecture is not about drawing diagrams.

Architecture is about designing the conditions under which change can happen safely.

Final Thought

Correct code is no longer enough.

A system can pass every traditional test and still behave in ways the organisation cannot explain, control, or defend.

That is the challenge of agentic software.

The next generation of SDLC must be built around governability.

Not just delivery.

Not just automation.

Not just intelligence.

Governability.

Because when software starts making decisions and taking action, the real measure of success is not whether the system works.

It is whether the system remains under control.

Thinking about introducing AI agents into your business? Don’t start with the model. Start with the control architecture.

I help technology leaders, founders, and executive teams design AI-enabled systems that are safe, governable, and ready for real business operations.

Before you connect an agent to your data, workflows, or customer-facing systems, you need clear answers to four questions:

What is the agent allowed to decide?
What actions can it take?
When must it stop and escalate to a human?
How will you prove later that it acted within policy?

If your organisation is exploring AI agents, automation, or AI-enabled operating models, I can help you assess the architecture, risks, governance model, and delivery approach before the system becomes expensive to fix.

Book a consultation to design your AI agent architecture before you deploy autonomy into the business.

Designed to Scale
