Software Engineering in the Age of Agents
Prof G-J van Rooyen | What's Working in AI | Capitec x Octoco AI
Towards the end of last year, I was working on a project for one of our clients, deep into a large and complex software engagement. We had a well-structured codebase and had managed to abstract the problem in a way that let us take an enormous volume of documentation and translate it into code in a systematic way.
At some point, the client asked the obvious question: What are we in for? What is the scope and budget to complete the project?
So we did what any responsible engineering team would do. We looked at the work already completed, the time spent on it, and made a conservative estimate through linear extrapolation. The number we came back with was significant, and it was not what the client had anticipated or budgeted for.
We saw an opportunity to test something we'd been eager to try. If we set AI loose on this problem, the cost in compute tokens would be a rounding error compared to the original estimate.
What followed took about an afternoon of work, with the AI generating a volume of code that would have taken a team weeks to produce manually.
A few caveats:
- This happened to be a very AI-amenable codebase.
- The abstractions were already well structured, and the underlying rules existed as natural-language documentation that could be cleanly translated into code.
- We also made clear to the client that the work still required validation, real-world testing, and careful review.
But something happened in that moment that genuinely made my eyes widen. We had just collapsed what was shaping up to be a very large project into an afternoon. And as a software consultancy, a business whose model is built on selling hours of coding work where the hour serves as a proxy for value delivered, that was not a comfortable realisation.
Agentic Systems and Why They're Sensitive
When you start building systems with agents, they begin to resemble control systems in important ways, and control systems are notoriously sensitive to what happens in their feedback loops.
Control system engineers spend considerable time worrying about stability and tuning. The same concerns apply to agentic systems. These systems are sensitive to the tools available to the agent, the prompts that define its intent, and the context, which is the full history of messages the model receives. Large Language Models (LLMs) don't actually carry memory between turns; each time they generate a response, that memory has to be provided again as part of the context window.
This means that prompt design, tool selection, and context management become primary engineering concerns when building reliable systems with LLMs.
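To make that statelessness concrete, here is a minimal sketch of multi-turn conversation with a stateless model. `call_model` is a hypothetical stand-in for any chat-completion API, not a real library call; the point is that the entire message history must be rebuilt and re-sent on every single turn.

```python
# Minimal sketch of stateless LLM turn-taking: the model keeps no memory,
# so the full history is re-sent on every call. `call_model` is a
# hypothetical placeholder for a real chat-completion API.

def call_model(messages: list[dict]) -> str:
    # Placeholder: a real implementation would call an LLM API here.
    return f"(reply to {len(messages)} messages)"

SYSTEM_PROMPT = {"role": "system", "content": "You are a helpful assistant."}

def run_conversation(user_turns: list[str]) -> list[dict]:
    history = [SYSTEM_PROMPT]
    for turn in user_turns:
        history.append({"role": "user", "content": turn})
        # The entire context window is rebuilt and passed in each time;
        # nothing persists inside the model between these calls.
        reply = call_model(history)
        history.append({"role": "assistant", "content": reply})
    return history

convo = run_conversation(["Hello", "Summarise our chat"])
```

Because the history is reassembled on every turn, anything the loop trims, reorders, or injects directly changes the model's behaviour, which is exactly why context management is an engineering concern rather than an afterthought.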
How Coding Has Evolved
After OpenAI’s 2022 Christmas Gift of ChatGPT, powered by the then-new GPT-3.5 LLM, these new inference engines were rapidly applied to help software developers work more efficiently.
2023 was the year of Coding Autosuggest. GitHub Copilot was right there with you in the IDE, gently suggesting how you might want to finish the line of code you'd started.
2024 saw competitive models evolve, with challengers like Anthropic’s Claude proving remarkably good at understanding and generating code. This was the year of Chat-Oriented Programming, where developers used chatbots as expert systems, whether inside the IDE or simply by pasting, chatting, and copying.
2025 brought deep integration into programming ecosystems. The LLMs no longer powered mere chatbots – they were Pair Programming coding agents working shoulder-to-shoulder with the developer to create code.
2026 might bring the slowly disappearing IDE. Consider this quote from Boris Cherny, one of the creators of Claude Code:
The last month was my first month as an engineer that I didn’t open an IDE at all. Opus 4.5 wrote around 200 PRs, every single line. Software engineering is radically changing, and the hardest part, even for early adopters and practitioners like us, is to continue to re-adjust our expectations. And this is still just the beginning. [26 December 2025, X]
What Happens to the Billable Hour
This raises an uncomfortable question for a software consultancy.
If the time required to generate code approaches zero, how do we value software when the cost of generating it collapses?
Our hypothesis is that the value shifts to the edges of the production process.
Upstream, the value shifts to requirements analysis. When creating code is cheap and fast, it becomes easier than ever to quickly build the wrong thing. The real skill becomes understanding what should actually be built.
Downstream, the value shifts to reliable deployment. Delivering software that runs reliably and scales properly within a real production environment remains extremely difficult.
Neither of these is a task you can shortcut your way through.
We're also seeing organisations form Applied AI teams focused on building highly customised internal software rather than relying entirely on off-the-shelf tools. When generation is cheap, building something tailored to your exact situation becomes far more accessible.
Engineering Reliable Systems With Fuzzy Components
One of the harder engineering challenges we're navigating is how to build reliable systems when some of the components are inherently unpredictable.
LLM-based systems are somewhat reminiscent of nonlinear control systems. In the forward path sits the system under control (the LLM), which may be noisy and unpredictable. The feedback path contains the components you can modulate: the system prompt, the tools available to the agent, and the context (conversation history, real or manipulated) provided to the model during each control step. Getting these right, and keeping them right, requires genuine engineering discipline.
A recent project that we worked on illustrates this well. The client had developed a prototype and needed help taking it to production. There are still relatively few teams capable of taking an LLM-based prototype and engineering it into a reliable product, and that gap between proof-of-concept and production is where much of the real engineering work lives.
The product needed to drive a structured, goal-directed conversation with a user, which introduced an interesting design challenge. In a typical chatbot interaction, the human steers the conversation and the model responds. This use case inverts that. The system needs to control the conversation, guide it through specific topics, and prevent the user from pulling it off course.
The first version used a chain of micro-agents, each focused on a specific topic, with each agent handling a portion of the conversation before handing off to the next. The current production system uses a deterministic orchestrator that launches sub-agents and continuously modifies their context to keep things on track. As engineering teams everywhere continue to work on building unpredictable LLMs into reliable systems, more of these design patterns will emerge.
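As a hedged illustration of that second design, and not the client's actual implementation, the orchestrator pattern can be sketched as a deterministic loop that owns the topic sequence and rebuilds each sub-agent's context before launching it. All names below are invented for the sketch.

```python
# Illustrative sketch of a deterministic orchestrator driving
# topic-focused sub-agents. The sub-agents are stubs standing in for
# LLM calls; the orchestrator, not the model, controls the flow.

from dataclasses import dataclass, field

@dataclass
class Conversation:
    history: list[str] = field(default_factory=list)

def make_agent(topic: str):
    # A real sub-agent would wrap an LLM call constrained to one topic;
    # here it simply acknowledges the input and reports completion.
    def agent(context: list[str], user_input: str) -> tuple[str, bool]:
        reply = f"[{topic}] noted: {user_input}"
        return reply, True  # (assistant reply, topic complete?)
    return agent

def orchestrate(topics: list[str], user_inputs: list[str]) -> Conversation:
    convo = Conversation()
    agents = {t: make_agent(t) for t in topics}
    for topic, user_input in zip(topics, user_inputs):
        # The orchestrator decides the current topic and rebuilds a
        # trimmed, topic-pinned context before each sub-agent run.
        context = [f"SYSTEM: stay strictly on the topic '{topic}'"]
        context += convo.history[-4:]  # only recent turns, to stay focused
        reply, done = agents[topic](context, user_input)
        convo.history += [f"user: {user_input}", f"assistant: {reply}"]
        if not done:
            break  # deterministic control: retry or escalate as needed
    return convo

convo = orchestrate(["income", "expenses"], ["I earn X", "I spend Y"])
```

The design choice worth noting is that topic progression lives in ordinary deterministic code, so the unpredictable component is confined to generating replies within a context the orchestrator fully controls.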
Agentic Operations
If the cost of generating code collapses, the bottleneck moves. Increasingly, it shifts to the operations surrounding code generation, and coding agents are just the first visible example of a much broader pattern.
At Octoco, we're experimenting with what we call Agentic Ops: using agents to assist with operational work across the organisation. One example is a small piece of custom automation I built that generates monthly reports from our timesheet system. It is highly specific to our organisation, but straightforward to build with agentic tools, and the time freed up goes toward the work that actually requires human judgment.
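The text doesn't describe how that report generator works, so the following is purely illustrative: a toy monthly aggregation over timesheet entries of an assumed shape, the kind of organisation-specific glue code that agentic tools make cheap to produce.

```python
# Illustrative only: a tiny monthly-report aggregation over timesheet
# entries assumed to have the shape (person, day, hours, project).

from collections import defaultdict
from datetime import date

def monthly_report(entries, year: int, month: int) -> dict:
    # Sum hours per project for the requested month.
    totals: dict[str, float] = defaultdict(float)
    for person, day, hours, project in entries:
        if (day.year, day.month) == (year, month):
            totals[project] += hours
    return dict(totals)

entries = [
    ("alice", date(2025, 11, 3), 6.0, "client-a"),
    ("bob",   date(2025, 11, 4), 4.5, "client-a"),
    ("alice", date(2025, 12, 1), 8.0, "client-b"),
]
report = monthly_report(entries, 2025, 11)
# report == {"client-a": 10.5}
```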
This points to a broader economic shift. When building internal tools becomes cheap, organisations can create highly customised solutions instead of relying on SaaS products. That starts to erode some traditional SaaS value propositions, and if you're in the business of selling software, it's worth thinking carefully about how that evolves.
The patterns here also help explain something we observe regularly: surveys frequently show that around 95% of organisations applying AI see no meaningful improvement. That number suggests the true bottleneck is not code generation. It lies elsewhere in the organisation: in requirements, operations, deployment, and process design, many of which may have, over many years, fallen into lockstep with the traditionally slow pace of software development.
Finding and addressing those constraints is the harder problem.
Where This Leaves Us
The business of software is transforming quickly. We're at a point of technological transformation across our industry and broader society that will fundamentally affect how we work and what we build.
The constraints and bottlenecks are shifting. The value is shifting. The skills that matter most are shifting. Engineering with LLMs and building reliable agentic systems are genuinely new fields, and we're still in the early stages of understanding what good practice looks like.
Everything is changing. The question is whether your organisation changes with it or waits until the bottlenecks become impossible to ignore.