The Evolution Path
Every business process is a hybrid of two kinds of work:
- Instruction-ready work — work where you can give a machine explicit instructions: simple if/then rules, algorithmic transformations, data parsing, API calls, analytical tasks.
- Intuition-based work — unstructured language, messy human interactions, subjective judgment. For most of computing history, this resisted machine automation entirely. Large Language Models (LLMs) changed that: they can now handle significant portions of it, up to a point.
The goal is not to maximise intelligence in the system. It is to push intelligence to the edges and keep everything between those edges deterministic, repeatable, and auditable. Agentic workflows add value only where deterministic execution is not feasible. Many teams get this placement wrong, and that is where many initiatives fail.
The Deterministic Boundary Test
Every process has a deterministic skeleton. The task is finding how deep it goes — for some workflows only the top-level sequence is fixed; for others you can push it down to individual field extractions and routing rules. Apply one question to every step:
Can this step be written as explicit code?
If yes, build it as a fixed workflow — use a coding agent (Claude Code, Cursor) to implement the logic if it's complex. The output should be deterministic code, not an LLM deciding at runtime. If no, the step needs judgment or handles inputs that genuinely can't be scripted: use the LLM only there, and keep everything else deterministic.
But even deterministic steps have a boundary. Each one was written for the inputs you anticipated, not the ones you didn't — a malformed field, an unexpected format, an edge case you've never seen. That's where deterministic code fails. If the LLM handles the else branch of each step, the system degrades gracefully instead of breaking. When the LLM catches the same edge case repeatedly, write deterministic code for it and retire the fallback. The boundary expands: the system grows more deterministic over time, not less.
These steps form a natural three-stage path from understanding a process to building it right: first understand the process, then encode it, then generalise it.
Example: Customer Support Email Triage
A support email arrives. Here's how to apply the three stages.
- Parse email: extract sender, subject, body
- Look up customer by email in the database
- Extract order ID via regex
- Fetch order status from the orders API
- Route to escalation queue if SLA breached
- Classify issue from free-text (billing? shipping? defect?)
- Draft a response tailored to the customer's tone and context
- Decide whether an exception (refund, replacement) is warranted
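One plausible reading of the list above is that the first five steps pass the boundary test and can be plain code, while the last three need judgment. The sketch below is illustrative only: the `ORD-` plus six digits order-ID format, the in-memory `CUSTOMERS` and `ORDERS` tables (standing in for the database and the orders API), and the `llm_classify` stub are all assumptions, and the drafting and exception steps are left out for brevity.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Hypothetical stand-ins for the customer database and the orders API.
CUSTOMERS = {"ana@example.com": {"id": 17, "name": "Ana"}}
ORDERS = {"ORD-004211": {"status": "shipped", "sla_breached": True}}

ORDER_ID = re.compile(r"\bORD-\d{6}\b")  # assumed order-ID format


def llm_classify(body: str) -> str:
    """Stub for the judgment step; a real system would call a model here."""
    return "shipping"


def triage(raw_email: str) -> dict:
    # Deterministic: parse sender, subject, body.
    msg = message_from_string(raw_email)
    sender = parseaddr(msg.get("From", ""))[1].lower()
    subject = msg.get("Subject", "")
    body = msg.get_payload()

    # Deterministic: customer lookup, regex extraction, order fetch.
    customer = CUSTOMERS.get(sender)
    match = ORDER_ID.search(subject + "\n" + body)
    order_id = match.group() if match else None
    order = ORDERS.get(order_id) if order_id else None

    # Deterministic: static routing rule.
    queue = "escalation" if order and order["sla_breached"] else "standard"

    # Judgment: only the free-text classification reaches the LLM.
    return {
        "sender": sender,
        "customer": customer,
        "order_id": order_id,
        "queue": queue,
        "issue": llm_classify(body),
    }


RAW = "From: Ana <ana@example.com>\nSubject: Where is ORD-004211?\n\nMy parcel is late.\n"
result = triage(RAW)
```

Everything except `llm_classify` produces the same output for the same email, so it can be unit-tested like any other code.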
A nuance: some tasks skip Stages 1 and 2 entirely
This framework assumes you're starting from a deterministic process: something with a clear, repeatable structure you first understand, then encode, then generalise. That fits well for invoice processing, customer triage, report generation, data pipelines.
Some tasks were never deterministic to begin with — writing code, research, open-ended planning — and the agent is the only viable approach from the start. The lesson still applies: don't add an agent to something genuinely deterministic, and don't rigidly script something inherently open-ended.
The Generalisation Step
Agentic workflow modelling is fundamentally an exercise in abstraction. We identify common structure, isolate variation points, and apply cognition only where abstraction breaks down.
Once your workflow is working (deterministic steps built and tested, LLM filling the gaps and catching the unknowns), the next question is scope. Can one workflow handle more than one process? Look across your business for similar processes. Where they share the same steps, keep them fixed. Where they diverge, apply the same question from Stage 2: can this difference be handled deterministically (a routing rule, a conditional, a lookup)? If yes, code it. If no, the step genuinely requires judgment that varies by context: that's where the agent decides. One generalised workflow replaces many one-off automations.
The recipe: map multiple similar processes, identify where they diverge, and at each divergence point assess whether the routing can be scripted. If yes, code it. If no, that's where the agent decides.
Implementing This at Scale
Stage 1: Process Decomposition
Strip away the "AI magic" and document the process as raw engineering: inputs, outputs, state changes, and decision trees. The deliverable is a flow diagram or pseudocode that describes the strict sequence of events.
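A decomposition like this can also be captured as data rather than a drawing. The following is a minimal sketch, not a prescribed format: the `Step` record and the `check_sequence` helper are hypothetical names, and the step list reuses the email triage example. It checks that every step's inputs are produced by an earlier step, which is the "strict sequence of events" made testable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    inputs: tuple[str, ...]
    outputs: tuple[str, ...]


def check_sequence(steps: list[Step], initial: set[str]) -> list[str]:
    """Return a list of problems: inputs that no earlier step produced."""
    available = set(initial)
    problems = []
    for step in steps:
        for needed in step.inputs:
            if needed not in available:
                problems.append(f"{step.name} needs '{needed}' before it exists")
        available.update(step.outputs)
    return problems


# Illustrative decomposition of the email triage process.
PROCESS = [
    Step("parse_email", ("raw_email",), ("sender", "subject", "body")),
    Step("lookup_customer", ("sender",), ("customer",)),
    Step("extract_order_id", ("subject", "body"), ("order_id",)),
    Step("fetch_order_status", ("order_id",), ("order_status",)),
    Step("route", ("order_status", "customer"), ("queue",)),
]

problems = check_sequence(PROCESS, initial={"raw_email"})
```

An empty `problems` list means the sequence is internally consistent; reordering the steps so that one runs before its inputs exist produces a readable error instead of a runtime surprise.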
Stages 2 and 3: The Deterministic Boundary
When evaluating the workflow, every step falls into one of three execution types:
| Execution Type | When to Use | Tooling |
|---|---|---|
| Pure Deterministic Code | Field extraction, API calls, database lookups, static routing | Python, TypeScript (built with a coding agent) |
| LLM as Core Engine | Unstructured text classification, synthesis, judgment calls that can't be scripted | Structured outputs: JSON mode, Instructor, Pydantic |
| LLM as Catch-All | The else branch of deterministic steps: malformed inputs, formatting anomalies, edge cases not yet in the deterministic parser | Fallback handlers, telemetry alerts, refactor triggers |
The distinction between rows 2 and 3 matters: "LLM as Core Engine" is a planned, permanent use of the LLM for steps that were never deterministic. "LLM as Catch-All" is a temporary fallback for steps that are deterministic but haven't yet covered a specific edge case. The second kind should shrink over time as you harden the boundary.
The Cost of Cognition
Deterministic and agentic execution have fundamentally different cost profiles. Treat agentic computation as an expensive resource: deploy it only where deterministic execution genuinely cannot do the job.
| Execution Type | Cost | Latency | Reliability |
|---|---|---|---|
| Deterministic | Lowest | Lowest | Highest |
| Agentic | Highest | Highest | Variable |
Every step you keep deterministic is a step you run faster, cheaper, and with predictable output. The LLM budget goes further when it is spent only where code and rules genuinely cannot do the job.
Engineering the Expanding Boundary
The hardening loop has a concrete engineering implementation: wrap every deterministic step in a try/except block, route failures to an LLM handler, log every trigger, and refactor when a pattern repeats.
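The exception handler's prompt should ask the LLM for two things: a resolved value for the failed input, and a short label naming the failure pattern. The label instruction is critical: without it you collect resolved cases but no dataset to refactor from.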
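The loop described above (wrap, route failures, log, refactor on repeats) might look like the sketch below. It makes several assumptions: `with_fallback`, `pattern_counts`, and `REFACTOR_AFTER` are hypothetical names, the threshold of 3 is arbitrary, and `fake_llm` is a stand-in for a real model call that returns both a result and a label.

```python
import logging
from collections import Counter

log = logging.getLogger("fallback")
pattern_counts: Counter = Counter()
REFACTOR_AFTER = 3  # illustrative threshold, not a recommendation


def with_fallback(step_name, deterministic, llm_handler):
    """Wrap a deterministic step so that its failures go to an LLM handler.

    llm_handler(value, error) must return (result, label), where label is
    a short name for the failure pattern.
    """
    def run(value):
        try:
            return deterministic(value)
        except Exception as err:  # broad on purpose: any failure is a trigger
            result, label = llm_handler(value, err)
            pattern_counts[(step_name, label)] += 1
            log.warning("fallback step=%s label=%s input=%r",
                        step_name, label, value)
            if pattern_counts[(step_name, label)] == REFACTOR_AFTER:
                log.error("refactor candidate: %s/%s", step_name, label)
            return result
    return run


def fake_llm(value, error):
    # Stand-in for a model call whose prompt asks for a result AND a label.
    return float(value.replace(",", ".")), "decimal_comma"


parse_amount = with_fallback("parse_amount", float, fake_llm)
clean = parse_amount("12.50")      # handled by deterministic code
recovered = parse_amount("12,50")  # float() fails, the fallback resolves it
```

Once `decimal_comma` shows up often enough, the fix belongs in the deterministic parser, and that label stops appearing in the log.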
The Agent Router
When consolidating multiple similar processes into one generalised workflow, every divergence point needs a router. Apply the same assess-first rule: never use an LLM router if a deterministic one will do.
Both routes converge back into a deterministic code pipeline. The LLM router's job is narrow: classify intent into a fixed set of options (structured output, not free text), then hand off. The routing decision itself should never be open-ended.
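A minimal sketch of that assess-first rule, under stated assumptions: the `Route` options, the `KEYWORDS` table, and the `llm_router` callable are all hypothetical. The deterministic router runs first, the LLM router is consulted only when the rules are silent or ambiguous, and its answer is forced into the fixed set of options.

```python
from enum import Enum
from typing import Callable, Optional


class Route(Enum):
    BILLING = "billing"
    SHIPPING = "shipping"
    DEFECT = "defect"


# Illustrative keyword rules; a real table would come from the process map.
KEYWORDS = {
    "invoice": Route.BILLING,
    "refund": Route.BILLING,
    "delivery": Route.SHIPPING,
    "tracking": Route.SHIPPING,
    "broken": Route.DEFECT,
}


def rule_router(text: str) -> Optional[Route]:
    lowered = text.lower()
    hits = {route for word, route in KEYWORDS.items() if word in lowered}
    # Exactly one matching route is a decision; zero or several is not.
    return hits.pop() if len(hits) == 1 else None


def route(text: str, llm_router: Callable[[str], str]) -> tuple[Route, str]:
    decided = rule_router(text)
    if decided is not None:
        return decided, "rule"
    # Route(...) raises ValueError for anything outside the fixed set,
    # so free-text answers from the model never reach the pipeline.
    return Route(llm_router(text).strip().lower()), "llm"


by_rule = route("Where is my tracking number?", lambda t: "billing")
by_llm = route("It arrived but something feels off", lambda t: " Defect ")
```

Returning the source of the decision (`"rule"` or `"llm"`) alongside the route makes it easy to measure how often the LLM router is actually needed.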
One metric to track
As the boundary expands over time, the proportion of inputs hitting the LLM fallback should fall. Track fallback rate (percentage of inputs that reach the LLM catch-all) per deterministic step. If the rate isn't declining after hardening cycles, the refactor loop isn't closing.
Example threshold: if a step's fallback rate exceeds 5% over a rolling 10,000-input window, raise an automated alert. That rate signals structural drift in input formats, not random noise, and warrants an immediate refactor cycle rather than leaving the LLM to absorb it indefinitely.
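The example threshold can be tracked with a rolling window per step. This is a sketch only: `FallbackMonitor` is a hypothetical class, and the choice to stay silent until the window is full is an assumption made to avoid alerting on the first few inputs.

```python
from collections import deque


class FallbackMonitor:
    """Rolling fallback rate for one deterministic step."""

    def __init__(self, window: int = 10_000, threshold: float = 0.05):
        self.events = deque(maxlen=window)  # True = input hit the LLM fallback
        self.threshold = threshold
        self.fallbacks = 0

    def record(self, used_fallback: bool) -> None:
        # If the window is full, the oldest event is about to be evicted.
        if len(self.events) == self.events.maxlen and self.events[0]:
            self.fallbacks -= 1
        self.events.append(used_fallback)
        self.fallbacks += int(used_fallback)

    @property
    def rate(self) -> float:
        return self.fallbacks / len(self.events) if self.events else 0.0

    def should_alert(self) -> bool:
        # Alert only on a full window, and only when the rate exceeds the threshold.
        full = len(self.events) == self.events.maxlen
        return full and self.rate > self.threshold


# Small window for demonstration; the text's example uses 10,000.
monitor = FallbackMonitor(window=100, threshold=0.05)
for i in range(100):
    monitor.record(used_fallback=(i < 6))  # 6 fallbacks in 100 inputs
alert = monitor.should_alert()
```

Keeping one monitor per deterministic step matches the advice to track the rate per step rather than for the workflow as a whole.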
Workflows vs Chatbots
Both use LLMs, but they're fundamentally different things:
Agentic Workflow
- Has a start and an end
- The agents decide when to stop
- Goal: complete a defined task
- Can reflect, iterate, find different paths
Chatbot
- No defined end: runs until user stops
- The user decides when to stop
- Goal: continuous conversation
- Reactive: responds to user input
What Makes Workflows Powerful
Even though workflows have a defined end, they're not rigid. Agents within them can reflect on their work, retry with improvements, and choose different execution paths. Structure plus flexibility is what separates them from both chatbots and fixed-script automation.
The Modelling Playbook
Every workflow has two dimensions: the process (what steps happen and in what order) and the agents (who decides and acts at each node). The diagram below shows what each lens looks like in practice.
The Shift in Thinking
Process-centric asks: what are the steps? Agent-centric asks: who handles this, and what can they decide? Start with the process to get the structure. Then identify which nodes need an agent, and leave everything else deterministic.
Design Checklist: How to Spec a Workflow
Deterministic
- List all tasks, start to finish
- Map sequence + dependencies
- Document inputs/outputs per step
- Define decision points (if/then rules)
- Write execution functions
- Visualise as flowchart
- Validate + test
Done once. Iterate only if tests fail.
Agentic
- Agent capabilities: what each agent can do, decide, and act on
- Goal structure: objectives and how success is measured
- Environment constraints: boundaries agents must operate within
This is a spec: it describes what you configure. What makes the system agentic is what happens after deployment.
The Operational Cycle: What Makes It Agentic
The spec above is just the starting point. What makes a system agentic is that it runs in a loop: each outcome feeds back and shapes what the agent does next. (This adapting happens within a single run — carrying it across runs needs memory, since the model itself isn't retrained.) For the loop to work, the agent has to remember the run so far: each step's result is collected and passed into the next LLM call. Without that running record, every call starts blank and the feedback has nowhere to land.
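The running record can be as simple as a list that grows with each step and is passed into every decision. A minimal sketch, assuming a hypothetical `decide` callable in place of the LLM call and a hypothetical `TOOLS` table; `scripted_decide` only imitates what a model might return.

```python
def run_agent(goal, decide, tools, max_steps=5):
    """Run a decide-act loop within a single run.

    decide(goal, history) returns (tool_name, argument), or
    ("finish", answer) when the agent judges the goal met.
    """
    history = []  # the running record of this run
    for _ in range(max_steps):
        action, arg = decide(goal, history)  # the full record goes into every call
        if action == "finish":
            return arg, history
        result = tools[action](arg)
        history.append({"action": action, "arg": arg, "result": result})
    return None, history  # step budget exhausted without an answer


def scripted_decide(goal, history):
    # Stand-in for an LLM call; it reads the record to choose the next move.
    if not history:
        return "lookup", "ORD-004211"
    if history[-1]["result"] == "delayed":
        return "finish", "apologise and offer an updated delivery date"
    return "finish", "confirm the order is on track"


TOOLS = {"lookup": lambda order_id: "delayed"}  # hypothetical tool
answer, record = run_agent("resolve the ticket", scripted_decide, TOOLS)
```

If `history` were not passed back in, `scripted_decide` would request the same lookup on every call, which is the "every call starts blank" failure described above.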
Why This Matters
A deterministic workflow runs the same way every time. That is the right choice for stable, auditable processes. An agentic workflow improves with every iteration of the loop within a run: outcomes feed back into the next decision, so the agent adjusts its reasoning, tries different paths, and gets closer to its goal. Carrying those lessons across runs requires adding memory.
Architectural Blueprinting
Architectural blueprinting is the practice of translating a workflow into a diagram that anyone on the team can read and build from. The conventions below define the shared visual language — you’ll see them applied in the Risk Assessment case study that follows.
Four Rules for Clear Diagrams
1. Standard Symbols
Rectangles for tasks, diamonds for decisions, arrows for flow. Pick a convention and stick to it. Consistency beats creativity in diagrams.
2. Clear Labelling
Do: "Fetch User Preferences"
Don't: "Get Data"
Names should tell you what happens without reading docs.
3. Show Inputs & Outputs
Every task consumes something and produces something. Label your arrows or add data annotations. This reveals dependencies and helps with debugging.
4. Right Granularity
Start high-level ("Process Order") and zoom in when needed ("Check Inventory" → "Process Payment"). Match the level of detail to your audience: executives and engineers need different diagrams.
Example: Granularity Levels
The Agentic Litmus Test
The agentic litmus test comes down to a single question:
Is the execution path fixed at design time, or determined at runtime?
You can apply this test to an entire workflow or zoom into a single task within it.
- The Signal: If a component follows a predefined sequence of steps with well-defined inputs and outputs, it should be deterministic. If the component must decide what information to gather, which tools to use, which actions to take next, or when the task is complete, that is where an agent belongs.
- The Reality: A workflow can use AI at every single node and remain entirely deterministic. Conversely, a single autonomous task embedded within an otherwise rigid pipeline can be genuinely agentic.
A useful rule of thumb:
Known process → Workflow
Known goal, unknown process → Agent
Case Study: Risk Assessment
A bank needs to evaluate a loan application and produce a risk decision. To do this, it may need to collect customer data, validate it, calculate a risk score, categorise the risk, and make a final decision.
The question is not what to do. The question is how to structure the work: does every application follow the same fixed sequence, or does the system decide what to investigate based on what it finds?
Version 1: Deterministic
The workflow is defined at design time. Five steps, always in the same order, always the same path.
Both applicants go through the same process. Only the outcome differs.
Applicant A
Collect → Validate → Score → Categorise → Approve
Applicant B
Collect → Validate → Score → Categorise → Reject
This is appropriate when regulations require consistency, inputs are well understood, and every decision must be auditable by the same standard.
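In code, the deterministic version is a fixed composition of functions. The sketch below is illustrative only: the field names, the exposure ratio used as a score, and the category thresholds are invented for the example and are not real credit policy.

```python
def collect(application: dict) -> dict:
    return {key: application[key] for key in ("income", "debt", "amount")}


def validate(data: dict) -> dict:
    if data["income"] <= 0 or data["debt"] < 0 or data["amount"] <= 0:
        raise ValueError("invalid application data")
    return data


def score(data: dict) -> float:
    # Hypothetical score: total exposure relative to income.
    return (data["debt"] + data["amount"]) / data["income"]


def categorise(risk_score: float) -> str:
    # Made-up thresholds, for illustration only.
    if risk_score < 2:
        return "low"
    if risk_score < 4:
        return "medium"
    return "high"


def decide(category: str) -> str:
    return "Reject" if category == "high" else "Approve"


def assess(application: dict) -> str:
    # The path is fixed at design time: every applicant takes all five steps.
    return decide(categorise(score(validate(collect(application)))))


applicant_a = {"income": 80_000, "debt": 10_000, "amount": 100_000}
applicant_b = {"income": 30_000, "debt": 20_000, "amount": 150_000}
decision_a = assess(applicant_a)
decision_b = assess(applicant_b)
```

Both applicants pass through the same five functions in the same order; only the values flowing through them, and therefore the final decision, differ.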
Version 2: Agentic
Instead of a fixed sequence, a planner decides what to investigate next based on what it discovers. The path is not set at the start; it emerges from the evidence.
Two applications arrive. The planner chooses a completely different investigation path for each.
Run 1: Property investment
Check location risk
Finding: high flood risk
Check insurance costs
Finding: extremely high
Decision: Reject
Path: Location → Insurance → Decision (2 steps)
Run 2: Development application
Check market conditions
Finding: stable
Check construction costs
Finding: rising rapidly
Check developer financials
Finding: weak balance sheet
Decision: Reject
Path: Market → Cost → Financials → Decision (3 steps)
Different paths. Different number of steps. Same overall goal.
Why This Is Truly Agentic
In the deterministic version, the execution path is known at design time:
Collect → Validate → Score → Categorise → Decision
In the agentic version, only the goal is known at design time:
Produce a risk decision.
Everything else is determined at runtime. The system must figure out:
- Which investigations are needed
- How many investigations are needed
- What order to conduct them in
- When enough evidence has been gathered to decide
Those decisions are made by the planner at runtime, based on what each investigation reveals. That is what makes it genuinely agentic.
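The two runs above can be reproduced with a small loop in which a planner picks the next investigation from the findings so far. This is a sketch under loud assumptions: `stub_planner` is a hand-written stand-in for what would normally be an LLM call, and the investigation names and `facts` tables are invented to mirror the example.

```python
def run_planner(application, plan_next, investigations, max_steps=6):
    """Let the planner choose each next step; return (decision, path taken)."""
    findings = []  # (investigation name, finding) pairs, in order
    for _ in range(max_steps):
        step = plan_next(application, findings)
        if step in ("Approve", "Reject"):
            return step, [name for name, _ in findings]
        findings.append((step, investigations[step](application)))
    return "Refer to human", [name for name, _ in findings]


def stub_planner(application, findings):
    # Stand-in for an LLM planner: the next step depends on what was found.
    done = dict(findings)
    if application["type"] == "property":
        if "location" not in done:
            return "location"
        if done["location"] == "high flood risk" and "insurance" not in done:
            return "insurance"
        return "Reject" if done.get("insurance") == "extremely high" else "Approve"
    if "market" not in done:
        return "market"
    if "costs" not in done:
        return "costs"
    if done["costs"] == "rising rapidly" and "financials" not in done:
        return "financials"
    return "Reject" if done.get("financials") == "weak balance sheet" else "Approve"


# Each investigation just looks up a canned finding for the demo.
NAMES = ("location", "insurance", "market", "costs", "financials")
INVESTIGATIONS = {name: (lambda app, n=name: app["facts"][n]) for name in NAMES}

run_1 = {"type": "property",
         "facts": {"location": "high flood risk", "insurance": "extremely high"}}
run_2 = {"type": "development",
         "facts": {"market": "stable", "costs": "rising rapidly",
                   "financials": "weak balance sheet"}}

decision_1, path_1 = run_planner(run_1, stub_planner, INVESTIGATIONS)
decision_2, path_2 = run_planner(run_2, stub_planner, INVESTIGATIONS)
```

The loop itself is deterministic code with a step budget; only the choice of the next investigation is delegated to the planner.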
Enterprise Parallel
The same distinction appears in enterprise due diligence. A deterministic approach assigns every acquisition the same four reviews:
Financial → Legal → Security → ESG → Decision
An agentic approach starts with a goal and follows the evidence. The first finding reveals cybersecurity concerns, so the next step is infrastructure. That reveals legacy systems, so the next step is compliance exposure, then third-party vendors. The investigation path emerges from what is discovered, not from a checklist defined before the work began.
The Lesson
The risk assessment case study is a direct illustration of the agentic litmus test. In the deterministic version, the path is fixed at design time. In the agentic version, the system decides what to investigate next, what information to gather, and when enough evidence exists to conclude. Every case may require a completely different investigation path while pursuing the same objective. That is the test.
Real Agentic Patterns
Once a workflow decides its own steps, real patterns emerge. Here are four architectures used in practice:
1. Goal-Setting Loop
The workflow starts with a goal, not a step. It plans tasks, executes them with tools, evaluates results, and loops until the goal is met.
2. Group Chat Pattern (e.g. AutoGen)
A chat manager coordinates multiple agents: here an assistant and a user proxy. AutoGen (a multi-agent framework from Microsoft) popularised this pattern. The assistant works; the proxy acts as a stand-in for the human and judges when the job is done.
3. Worker + Critic (Nested)
A worker generates output, a critic evaluates it, and a user proxy decides when the result is good enough.
4. Crew Pattern (e.g. CrewAI)
A manager receives the goal and delegates to specialised agents, each with its own sub-tasks and tools. Agents can work in parallel, which is key to efficient agentic systems; in CrewAI, parallel execution is opt-in and the default is sequential. The pattern scales to many agents.
Parallel execution warning. When agents run concurrently and write to a shared memory object or state context, their outputs can conflict or overwrite each other. Parallel execution requires a deterministic orchestration layer (directed acyclic graph (DAG) or map-reduce) to synchronise the join before a manager synthesises results. Treat shared state as a critical section: one writer at a time, or use isolated output slots that the manager merges.
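The "isolated output slots" option can be sketched with the standard library alone. The example assumes each agent is a plain callable and uses hypothetical names (`run_crew`, `synthesise`); no agent touches shared state, and the manager merges only after every agent has finished.

```python
from concurrent.futures import ThreadPoolExecutor


def run_crew(goal: str, agents: dict) -> dict:
    """Run agents concurrently; each one fills only its own slot.

    agents maps a slot name to a callable(goal) that returns that agent's output.
    """
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        futures = {name: pool.submit(fn, goal) for name, fn in agents.items()}
        # Join: block until every agent is done before anything is merged.
        return {name: future.result() for name, future in futures.items()}


def synthesise(slots: dict) -> str:
    # The manager is the single writer of the merged result.
    return "; ".join(f"{name}: {slots[name]}" for name in sorted(slots))


# Hypothetical agents standing in for LLM-backed specialists.
AGENTS = {
    "legal": lambda goal: "no open litigation",
    "finance": lambda goal: "margins stable",
}

slots = run_crew("assess the acquisition target", AGENTS)
summary = synthesise(slots)
```

Because each agent returns a value instead of writing to a shared object, there is nothing to overwrite; the only merge happens in `synthesise`, after the join.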
The Common Thread
Every agentic pattern shares three traits: goals are set at runtime (not hardcoded), steps are generated (not predefined), and feedback loops drive iteration (not just retry-on-failure). The differences are in who coordinates: a goal loop, a chat manager, a critic, or a crew manager.
Agent Building Blocks
When modelling a workflow, you're wiring together agents of different types. Seven common ones, ordered from simple to sophisticated:
Choosing the Right Building Block
Match agent type to the job: Direct/Augmented for simple single-step tasks. Knowledge/RAG when accuracy matters more than creativity. Evaluation when quality needs a second pass. Routing when you need to dispatch across specialists. Planning when the steps themselves are unknown.
The Principle
The goal is not to maximise intelligence. The goal is to minimise the amount of intelligence required.
Every successful agentic system is mostly deterministic with carefully placed islands of cognition. The intelligence is not the architecture. It is a component within one.
Lesson Recap
What You Now Know
- Conceptual modelling: how to think about complex processes beyond simple linear steps (evolution path, generalisation, the agentic litmus test)
- Agent roles: how to define distinct agent types and responsibilities, from direct prompt to action planning, and wire them into workflows
- Parallel processing: how agentic patterns like crew managers and parallelisation let multiple agents work simultaneously for efficiency
- Deterministic vs agentic: when each approach fits, what makes a workflow truly agentic (runtime planning, dynamic paths, feedback loops), and how to evolve one into the other