Lesson 02 of N

Agentic Workflow Modelling

Designing blueprints for teams of agents that tackle complex tasks together

Jun 2026 ~25 min read Intermediate · Design

The Evolution Path

Every business process is a hybrid of two kinds of work:

The goal is not to maximise intelligence in the system. It is to push intelligence to the edges and keep everything between those edges deterministic, repeatable, and auditable. Agentic workflows add value only where deterministic execution is not feasible. Most teams get this wrong and that is where most initiatives fail.

The Deterministic Boundary Test

Every process has a deterministic skeleton. The task is finding how deep it goes — for some workflows only the top-level sequence is fixed; for others you can push it down to individual field extractions and routing rules. Apply one question to every step:

Can this step be written as explicit code?

If yes, build it as a fixed workflow — use a coding agent (Claude Code, Cursor) to implement the logic if it's complex. The output should be deterministic code, not an LLM deciding at runtime. If no, the step needs judgment or handles inputs that genuinely can't be scripted: use the LLM only there, and keep everything else deterministic.

But even deterministic steps have a boundary. Each one was written for the inputs you anticipated, not the ones you didn't — a malformed field, an unexpected format, an edge case you've never seen. That's where deterministic code fails. If the LLM handles the else branch of each step, the system degrades gracefully instead of breaking. When the LLM catches the same edge case repeatedly, write deterministic code for it and retire the fallback. The boundary expands: the system grows more deterministic over time, not less.

These steps form a natural three-stage path from understanding a process to building it right:

STAGE 1 Describe Map what happens, what decisions are made assess STAGE 2 Automate Can every step be done deterministically? fill gaps STAGE 3 Fill Gaps LLM only for steps that can't be scripted Process fully described Deterministic where possible LLM fills the gaps Start: understand it End: build it right

Example: Customer Support Email Triage

A support email arrives. Here's how to apply the three stages.

Stage 1: Describe
A support email arrives. I need to identify the customer, understand their issue, check their account history, categorise the issue type, and either resolve it directly or pass it to a human with context.
Assess: which steps can be scripted?
Deterministic ✓ → Stage 2: Automate
  • Parse email: extract sender, subject, body
  • Look up customer by email in the database
  • Extract order ID via regex
  • Fetch order status from the orders API
  • Route to escalation queue if SLA breached
Needs LLM ✓ → Stage 3: Fill Gaps
  • Classify issue from free-text (billing? shipping? defect?)
  • Draft a response tailored to the customer's tone and context
  • Decide whether an exception (refund, replacement) is warranted
Stages 2 & 3: Build it
Stage 2 (Automate): a coding agent (Claude Code, Cursor) turns the five deterministic steps into reliable pipeline code. Stage 3 (Fill Gaps): the LLM handles the three steps that genuinely can't be scripted — classification, response drafting, and exception judgment.
And at every deterministic step, the boundary matters. What if the email is a forwarded chain or the order ID is in an image? The deterministic code handles the expected case; the LLM catches what falls outside it. When the same edge case repeats, write a parser for it and retire the fallback.

A nuance: some tasks skip Stage 1 and 2 entirely

This framework assumes you're starting from a deterministic process: something with a clear, repeatable structure you first understand, then encode, then generalise. That fits well for invoice processing, customer triage, report generation, data pipelines.

Some tasks were never deterministic to begin with — writing code, research, open-ended planning — and the agent is the only viable approach from the start. The lesson still applies: don't add an agent to something genuinely deterministic, and don't rigidly script something inherently open-ended.

The Generalisation Step

Agentic workflow modelling is fundamentally an exercise in abstraction. We identify common structure, isolate variation points, and apply cognition only where abstraction breaks down.

Once your workflow is working (deterministic steps built and tested, LLM filling the gaps and catching the unknowns), the next question is scope. Can one workflow handle more than one process? Look across your business for similar processes. Where they share the same steps, keep them fixed. Where they diverge, apply the same question from Stage 2: can this difference be handled deterministically (a routing rule, a conditional, a lookup)? If yes, code it. If no, the step genuinely requires judgment that varies by context: that's where the agent decides. One generalised workflow replaces many one-off automations.

Process A Step 1 Step 2 Step 3 Process B Step 1 Step 2 Step 3 differs! Generalised Agentic Workflow Step 1 Agent Decides Step 3 Shared steps stay fixed. Different steps become agent decision points. same same

The recipe: map multiple similar processes, identify where they diverge, and at each divergence point assess whether the routing can be scripted. If yes, code it. If no, that's where the agent decides.

Implementing This at Scale

Stage 1: Process Decomposition

Strip away the "AI magic" and document the process as raw engineering: inputs, outputs, state changes, and decision trees. The deliverable is a flow diagram or pseudocode that describes the strict sequence of events.

Stages 2 and 3: The Deterministic Boundary

When evaluating the workflow, every step falls into one of three execution types:

Execution Type When to Use Tooling
Pure Deterministic Code Field extraction, API calls, database lookups, static routing Python, TypeScript (built with a coding agent)
LLM as Core Engine Unstructured text classification, synthesis, judgment calls that can't be scripted Structured outputs: JSON mode, Instructor, Pydantic
LLM as Catch-All The else branch of deterministic steps: malformed inputs, formatting anomalies, edge cases not yet in the deterministic parser Fallback handlers, telemetry alerts, refactor triggers

The distinction between rows 2 and 3 matters: "LLM as Core Engine" is a planned, permanent use of the LLM for steps that were never deterministic. "LLM as Catch-All" is a temporary fallback for steps that are deterministic but haven't yet covered a specific edge case. The second kind should shrink over time as you harden the boundary.

The Cost of Cognition

Deterministic and agentic execution have fundamentally different cost profiles. Treat agentic computation as an expensive resource: deploy it only where deterministic execution genuinely cannot do the job.

Execution Type Cost Latency Reliability
Deterministic Lowest Lowest Highest
Agentic Highest Highest Variable

Every step you keep deterministic is a step you run faster, cheaper, and with predictable output. The LLM budget goes further when it is spent only where code and rules genuinely cannot do the job.

Engineering the Expanding Boundary

The hardening loop has a concrete engineering implementation: wrap every deterministic step in a try/except block, route failures to an LLM handler, log every trigger, and refactor when a pattern repeats.

Input Data Deterministic Code try / parse / validate Success Next Step throws error LLM Exception Handler pass: failed input + error signature + label instruction heals data can't resolve Escalate to Human Review Log to Telemetry anomaly type + count Pattern repeats above threshold? No: keep logging Yes Refactor Into Deterministic Code retire LLM fallback for this case Track fallback rate (%), not just absolute count — 3 hits in 10 inputs is very different from 3 in 10,000.

Prompt pattern for the exception handler:

"The deterministic parser failed to extract the Order ID from this input. Analyse the string, extract the ID, and label the structural anomaly format so it can be added to the parser."

The label instruction is critical: without it you collect resolved cases but no dataset to refactor from.

The Agent Router

When consolidating multiple similar processes into one generalised workflow, every divergence point needs a router. Apply the same assess-first rule: never use an LLM router if a deterministic one will do.

Unified Workflow Entry Can routing be based on metadata or static rules? Yes Code-Based Router if/else, regex, DB enum, metadata tag lookup No LLM Semantic Router classify intent with structured output (JSON, fixed enum) small, fast model Deterministic Code Pipeline

Both routes converge back into a deterministic code pipeline. The LLM router's job is narrow: classify intent into a fixed set of options (structured output, not free text), then hand off. The routing decision itself should never be open-ended.

One metric to track

As the boundary expands over time, the proportion of inputs hitting the LLM fallback should fall. Track fallback rate (percentage of inputs that reach the LLM catch-all) per deterministic step. If the rate isn't declining after hardening cycles, the refactor loop isn't closing.

Example threshold: if a step's fallback rate exceeds 5% over a rolling 10,000-input window, raise an automated alert. That rate signals structural drift in input formats, not random noise, and warrants an immediate refactor cycle rather than leaving the LLM to absorb it indefinitely.

Workflows vs Chatbots

Both use LLMs, but they're fundamentally different things:

Agentic Workflow

  • Has a start and an end
  • The agents decide when to stop
  • Goal: complete a defined task
  • Can reflect, iterate, find different paths
Start
Agent A
Agent B
Done

Chatbot

  • No defined end: runs until user stops
  • The user decides when to stop
  • Goal: continuous conversation
  • Reactive: responds to user input
User
Bot
User

What Makes Workflows Powerful

Even though workflows have a defined end, they're not rigid. Agents within them can reflect on their work, retry with improvements, and choose different execution paths. Structure plus flexibility is what separates them from both chatbots and fixed-script automation.

The Modelling Playbook

Every workflow has two dimensions: the process (what steps happen and in what order) and the agents (who decides and acts at each node). The diagram below shows what each lens looks like in practice.

Process-Centric (Deterministic) Task 1: Receive Task 2: Validate Valid? yes no Task 3: Process Focus: sequence of tasks rigid connections, fixed paths Agent-Centric (Agentic) Planner Agent Search Validate Generate Review feedback loop Focus: agent capabilities + decisions flexible paths, feedback loops

The Shift in Thinking

Process-centric asks what are the steps? Agent-centric asks who handles this, and what can they decide? Start with the process to get the structure. Then identify which nodes need an agent — and leave everything else deterministic.

Design Checklist: How to Spec a Workflow

Deterministic

  • List all tasks, start to finish
  • Map sequence + dependencies
  • Document inputs/outputs per step
  • Define decision points (if/then rules)
  • Write execution functions
  • Visualise as flowchart
  • Validate + test

Done once. Iterate only if tests fail.

Agentic

  • Agent capabilities: what each agent can do, decide, and act on
  • Goal structure: objectives and how success is measured
  • Environment constraints: boundaries agents must operate within

This is a spec: it describes what you configure. But what makes it agentic is what happens after deployment ↓

The Operational Cycle: What Makes It Agentic

The spec above is just the starting point. What makes a system agentic is that it runs in a loop: each outcome feeds back and shapes what the agent does next. (This adapting happens within a single run — carrying it across runs needs memory, since the model itself isn't retrained.) For the loop to work, the agent has to remember the run so far: each step's result is collected and passed into the next LLM call. Without that running record, every call starts blank and the feedback has nowhere to land.

The Agentic Cycle Capability what the agent can do + how it reasons Decision agent evaluates options, picks a path Outcome result of the action — success or failure Feedback evaluate what worked, what didn't refines how agent behaves every rotation sharpens the run

Why This Matters

A deterministic workflow runs the same way every time. That is the right choice for stable, auditable processes. An agentic workflow improves with every rotation within a run. Outcomes feed back into capabilities: the agent adjusts its reasoning, tries different paths, and gets better at achieving its goals, though carrying those lessons across runs requires adding memory.

Architectural Blueprinting

Architectural blueprinting is the practice of translating a workflow into a diagram that anyone on the team can read and build from. The conventions below define the shared visual language — you’ll see them applied in the Risk Assessment case study that follows.

Visual Vocabulary Task / Action Rectangle = a step an agent performs ? Diamond = a decision point (branching) Agent A Agent B data Arrow = data or control flow JSON Dashed = data object (payload, schema) Use these consistently. Anyone should be able to read your diagram cold.

Four Rules for Clear Diagrams

1. Standard Symbols

Rectangles for tasks, diamonds for decisions, arrows for flow. Pick a convention and stick to it. Consistency beats creativity in diagrams.

2. Clear Labelling

Do: "Fetch User Preferences"
Don't: "Get Data"
Names should tell you what happens without reading docs.

3. Show Inputs & Outputs

Every task consumes something and produces something. Label your arrows or add data annotations. This reveals dependencies and helps with debugging.

4. Right Granularity

Start high-level ("Process Order"), zoom in when needed ("Check Inventory" → "Process Payment"). Match detail level to your audience. Executives vs engineers see different diagrams.

Example: Granularity Levels

High level Process Order zoom Detailed Check Inventory Charge Payment Ship Package Send Confirmation One box becomes four — each with its own inputs, outputs, and potential agent. The right zoom level depends on who's reading the diagram.

The Agentic Litmus Test

The agentic litmus test comes down to a single question:

Is the execution path fixed at design time, or determined at runtime?

You can apply this test to an entire workflow or zoom into a single task within it.

A useful rule of thumb:

Known process → Workflow

Known goal, unknown process → Agent

NOT AGENTIC Linear — fixed steps in fixed order Start Step 1 Step 2 Step 3 Step 4 Step 5 End Each step could use AI or an LLM — doesn't matter. The workflow is still a fixed script. NOT AGENTIC Conditional loop — branches but paths are predetermined Start Step 1 Step 2 Test yes Step 4 End no → retry Even the test can use AI. But we always know: yes → Step 4, no → back to Step 1. No surprises. CLOSER, BUT STILL NOT Task selection — picks a path, but from a fixed menu Start Select Task Task A Task B Task C Next Step End More flexible — but it's always "pick from {A, B, C}." The menu itself is fixed at design time. WHAT MAKES IT AGENTIC The workflow itself figures out what to do — not from a fixed menu, but by setting goals, generating tasks, and adapting based on results. Steps are generated at runtime, not defined at design time The workflow can change its own plan based on what it discovers There's no fixed path — every run can look completely different

Case Study: Risk Assessment

A bank needs to evaluate a loan application and produce a risk decision. To do this, it may need to collect customer data, validate it, calculate a risk score, categorise the risk, and make a final decision.

The question is not what to do. The question is how to structure the work: does every application follow the same fixed sequence, or does the system decide what to investigate based on what it finds?

Version 1: Deterministic

The workflow is defined at design time. Five steps, always in the same order, always the same path.

Input Data Collection Validation Risk Scoring Categorisation Decision Output Always the same 5 steps, always the same order, always the same path to output.

Both applicants go through the same process. Only the outcome differs.

Applicant A

Collect → Validate → Score → Categorise → Approve

Applicant B

Collect → Validate → Score → Categorise → Reject

This is appropriate when regulations require consistency, inputs are well understood, and every decision must be auditable by the same standard.

Version 2: Agentic

Instead of a fixed sequence, a planner decides what to investigate next based on what it discovers. The path is not set at the start; it emerges from the evidence.

Input Action Planner picks next task Location Market Cost picks one per cycle Evaluator checks quality Decision goal met? yes Output no — re-plan: different task, different order Same five tasks. But which runs, in what order, and how many cycles — all decided at runtime.

Two applications arrive. The planner chooses a completely different investigation path for each.

Run 1: Property investment

Check location risk

Finding: high flood risk

Check insurance costs

Finding: extremely high

Decision: Reject

Path: Location → Insurance → Decision (2 steps)

Run 2: Development application

Check market conditions

Finding: stable

Check construction costs

Finding: rising rapidly

Check developer financials

Finding: weak balance sheet

Decision: Reject

Path: Market → Cost → Financials → Decision (3 steps)

Different paths. Different number of steps. Same overall goal.

Why This Is Truly Agentic

In the deterministic version, the execution path is known at design time:

Collect → Validate → Score → Categorise → Decision

In the agentic version, only the goal is known at design time:

Produce a risk decision.

Everything else is determined at runtime. The system must figure out:

Those decisions are made by the planner at runtime, based on what each investigation reveals. That is what makes it genuinely agentic.

Enterprise Parallel

The same distinction appears in enterprise due diligence. A deterministic approach assigns every acquisition the same four reviews:

Financial → Legal → Security → ESG → Decision

An agentic approach starts with a goal and follows the evidence. The first finding reveals cybersecurity concerns, so the next step is infrastructure. That reveals legacy systems, so the next step is compliance exposure, then third-party vendors. The investigation path emerges from what is discovered, not from a checklist defined before the work began.

The Lesson

The risk assessment case study is a direct illustration of the agentic litmus test. In the deterministic version, the path is fixed at design time. In the agentic version, the system decides what to investigate next, what information to gather, and when enough evidence exists to conclude. Every case may require a completely different investigation path while pursuing the same objective. That is the test.

Real Agentic Patterns

Once a workflow decides its own steps, real patterns emerge. Here are four architectures used in practice:

1. Goal-Setting Loop

The workflow starts with a goal, not a step. It plans tasks, executes them with tools, evaluates results, and loops until the goal is met.

Input Set Goal what are we solving? Plan Tasks generate steps Execute agents + tools search calc ... Evaluate goal met? yes Output no → re-plan and retry Tasks are generated, not predefined. Every run can produce different steps.

2. Group Chat Pattern (e.g. AutoGen)

A chat manager coordinates multiple agents: here an assistant and a user proxy. AutoGen (a multi-agent framework from Microsoft) popularised this pattern. The assistant works; the proxy acts as a stand-in for the human and judges when the job is done.

Input Chat Manager monitors + coordinates Assistant does the work User Proxy judges as the human would iterate until proxy is satisfied Output

3. Worker + Critic (Nested)

A worker generates output, a critic evaluates it, and a user proxy decides when the result is good enough.

Input User Proxy final judge Worker generates answer Critic evaluates quality revise approved → return Output

4. Crew Pattern (e.g. CrewAI)

A manager receives the goal and delegates to specialised agents, each with their own sub-tasks and tools. Agents can work in parallel (parallel execution is opt-in; CrewAI defaults to sequential), which is key to efficient agentic systems. Scales to many agents.

Parallel execution warning. When agents run concurrently and write to a shared memory object or state context, their outputs can conflict or overwrite each other. Parallel execution requires a deterministic orchestration layer (directed acyclic graph (DAG) or map-reduce) to synchronise the join before a manager synthesises results. Treat shared state as a critical section: one writer at a time, or use isolated output slots that the manager merges.

Input Crew Manager delegates, then merges delegate Research Agent Gather Info Evaluate Sources Analyse Data Web API Database Scraper → Research Report ∥ run in parallel Writer Agent Draft Content Edit + Polish Format Output Templates Grammar → Draft Article Final Output merged by manager

The Common Thread

Every agentic pattern shares three traits: goals are set at runtime (not hardcoded), steps are generated (not predefined), and feedback loops drive iteration (not just retry-on-failure). The differences are in who coordinates: a goal loop, a chat manager, a critic, or a crew manager.

Agent Building Blocks

When modelling a workflow, you're wiring together agents of different types. Seven common ones, ordered from simple to sophisticated:

Simple Complex 1. Direct Prompt Raw LLM call. No persona, no context, no tools — just the question. User Prompt LLM Response "What's the capital of France?" → "Paris" 2. Augmented Prompt Adds a system message — persona shapes the response. Persona User Prompt LLM Styled 3. Knowledge Augmented Persona + curated knowledge. LLM is instructed to answer only from the provided knowledge. Persona Knowledge Prompt LLM Curated 4. Retrieval-Augmented Generation (RAG) Dynamically retrieves relevant docs before answering. Flexible, less hallucination. Query Retrieve embeddings LLM Grounded Key difference from #3: knowledge is searched dynamically, not pre-loaded. Note: RAG is an augmentation pattern on a prompt, not an agent role — any agent type above can use retrieval. 5. Evaluation Quality controller. Reviews other agents' output, sends back for revision if needed. Worker Evaluator Approved retry with feedback 6. Routing Project manager. Classifies the task, routes to the right specialist. Router Math Agent History Agent Code Agent often uses embeddings + similarity to classify 7. Action Planning Strategist. Takes a vague goal, breaks it into a plan of executable steps. Complex Goal Planner decomposes Step 1: Research Step 2: Draft Step 3: Review Unlike Router (who does it?), Planner decides what needs doing in the first place

Choosing the Right Building Block

Match agent type to the job: Direct/Augmented for simple single-step tasks. Knowledge/RAG when accuracy matters more than creativity. Evaluation when quality needs a second pass. Routing when you need to dispatch across specialists. Planning when the steps themselves are unknown.

The Principle

The goal is not to maximise intelligence. The goal is to minimise the amount of intelligence required.

Every successful agentic system is mostly deterministic with carefully placed islands of cognition. The intelligence is not the architecture. It is a component within one.

Lesson Recap

What You Now Know

← 01 Agentic Fundamentals 03 Implementation →