Lesson 01 of N

What Makes an AI Agent an Agent?

Understanding the line between automation and agentic workflows

Jun 2026 ~20 min read Beginner · Conceptual

What Is an AI Agent?

An LLM (Large Language Model) — the reasoning engine is the brain. An Agent — a system that perceives, decides, and acts is the whole body. It uses the LLM to understand, reason, and decide — but also acts on those decisions.

The Four Pillars

Perceive
its environment
Reason
about observations
Plan
to achieve goals
Act
execute the plan

These pillars describe what an agent does at runtime. The five components below describe how you build one — Persona and Knowledge shape Perceive/Reason, Prompting drives Plan, Tools enable Act, and Interaction connects the agent to its environment.

Automation vs Agentic

❌ Deterministic Workflow

Fixed input → fixed output, rigid rules, no decision-making.

Efficient but brittle.

Subscribe
Welcome email
Wait 3d
Follow-up

Robotic Process Automation (RPA), drip campaigns, cron jobs — same category.

✅ Agentic Workflow

A dynamic process — adapts, decomposes tasks, routes intelligently.

The agent decides what to do, not just how.

  • Adapts to new information
  • Decides its own steps
  • Handles novel situations

The Core Shift

Using an LLM doesn't make something an agent. What makes it agentic is the move from fixed script to goal-oriented process.

One nuance: this framing assumes you're starting from a deterministic process — something with a clear, repeatable structure (invoice processing, customer triage, report generation). Some tasks — writing code, research, open-ended planning — were never deterministic to begin with. Those are inherently agentic from the start; there's no fixed-script version to automate first. The mistake to avoid in both directions: don't add an agent to something that's genuinely deterministic (unnecessary complexity), and don't build rigid automation for something that's inherently open-ended (it'll break constantly).

Components of an Agent

Agent Persona Knowledge Prompting Tools Interaction

Deep Dive: Persona

The system prompt that narrows a general-purpose LLM into a focused identity. Same LLM, different persona = completely different agent.

Controls: role, tone, boundaries, output format

Formal Analyst

"You are a senior financial analyst. Use formal language. Always cite data sources. Output in structured tables."

User: "How's AAPL doing?"

Agent: [illustrative] "AAPL is trading at $XXX (+X.X%). Q3 revenue exceeded estimates by ~5%. See table below..."

Friendly Support Bot

"You are a friendly customer support assistant. Use casual language. Keep answers short and helpful. Add emojis."

User: "How's AAPL doing?"

Agent: [illustrative] "Apple's doing great today! Stock is up a bit — not bad at all! Anything else I can help with? 🍎"

Same LLM, same question — completely different responses. That's the power of persona.

Deep Dive: Knowledge

Not just training data — it's everything the agent can access. Four layers, from broad to specific:

  • LLM Training — broad, static, general
  • Fine-tuning — domain-specific training
  • Tools — live data via APIs, search, databases
  • Memory — short-term (this chat) and long-term (past sessions)

Example: Same Agent, Different Knowledge Layers

User asks: "What's the return policy?"

LLM Training Only
"Most retailers offer 30-day return policies..." (generic, possibly wrong)
+ Tools (Retrieval-Augmented Generation)
"Per our policy doc: 14-day returns, receipt required, electronics are final sale."

Adding the right knowledge layer transforms a guess into a fact.

Deep Dive: Prompting Strategy

The blueprint for how the agent talks to its LLM. It's not just the user's question — it's everything around it.

System Prompt persona + constraints Context memory + tool outputs User Query the actual question Full Prompt assembled LLM processes Response text / JSON / action

Prompting Techniques

Zero-shot
Just instructions, no examples
Few-shot
Include examples to guide style
Chain-of-thought
"Think step by step"
ReAct
Reason + Act in loops

Example: Same Question, Four Techniques

Question: "Is 17 a prime number?"

Zero-shot
"Is 17 prime?"
→ "Yes" (no guidance given)
Few-shot
"7→prime, 9→not prime. Is 17 prime?"
→ "Prime" (follows the pattern)
Chain-of-thought
"Think step by step."
→ "17 ÷ 2 = 8.5, ÷ 3 = 5.67... no divisors. Prime."
ReAct
"Reason, then verify with a tool."
→ Thought: "Check divisibility." Action: is_prime(17) → True

Each technique adds more structure — use the simplest one that gets reliable results.

Deep Dive: Execution / Tools

The agent's hands and feet. Tools — external capabilities: APIs, code runners, databases — are what let an agent act beyond generating text.

Agent + Tools Web Search Database Run Code APIs / Email Other Agents

Example: With vs Without Tools

User asks: "What's the weather in Tokyo right now?"

No Tools (LLM alone)
"Tokyo typically has mild weather in June..." (guessing from training data — could be completely wrong today)
With Weather API Tool
Agent calls get_weather("Tokyo") → "Currently 24°C, partly cloudy, 65% humidity." (real-time, accurate)

Tools bridge the gap between what the LLM knows and what's happening now.

Deep Dive: Interaction

How does the outside world talk to the agent, and how does the agent talk back? Three distinct channels. Agent-to-agent communication uses plain APIs or protocols like Agent2Agent (A2A), while Model Context Protocol (MCP) is what connects an agent to its tools and data sources.

Agent receives, processes, responds Input User chat API request Webhook / event Output Text response JSON data Trigger action Agent-to-Agent Agent B Agent C via A2A / APIs

Example: Three Interaction Modes for the Same Agent

A "Meeting Scheduler" agent — same logic, three different interaction surfaces:

Human → Agent
User types in Slack: "Schedule a standup for Monday 9 AM" → Agent books it on Google Calendar
Agent → Human
Agent sends email: "Conflict detected — your 3 PM overlaps with Design Review. Reschedule?"
Agent → Agent
Scheduler agent asks Availability agent via API: "Is room B2 free Monday 9–10?" → Gets JSON response

Interaction design determines who/what can trigger the agent and how results are delivered.

Components in a Workflow

Each component isn't just a feature — it's designed to make the agent function as a step in a larger pipeline.

Previous Step Agent Persona Knowledge Prompting Tools Interaction Next Step

The Workflow Lens

Persona scopes the task. Knowledge gives each step its own context to draw on. Prompting weaves in prior outputs. Tools take real actions. Interaction formats results so the next step can consume them.

The Agent Spectrum

Not all agents are equal. They sit on a spectrum of Large Language Model (LLM) interaction sophistication:

Direct Q in → A out Augmented + docs, examples Dynamic Context + memory, live tools Autonomous plans + executes Simple → → Complex

Direct

LLM role: Responder

Context: Static, within prompt

Fit: Single-step tasks — answer a question, generate text from one instruction

Augmented

LLM role: Guided responder

Context: Pre-augmented (Retrieval-Augmented Generation)

Fit: Specialised steps where domain data shapes the output

Dynamic Context

LLM role: Adaptive processor

Context: Live via memory + tools

Fit: Multi-step loops — fetch, process, summarise in one flow

Autonomous

LLM role: Planner + executor

Context: Dynamic, memory-driven

Fit: End-to-end orchestration — goal in, sub-tasks planned, self-corrects on errors

Workflow Patterns (Preview)

Single agents are building blocks. Real power comes from how you wire them together.

Prompt Chaining Step 1 Step 2 Step 3 Output of each step feeds the next Routing Router Specialist A Specialist B Specialist C Parallelisation Input Agent A Agent B Merge Multiple agents work at the same time Evaluator Worker Evaluator retry if needed Orchestrator Orchestrator Agent A Agent B Agent C Central agent coordinates the whole workflow

Key Takeaways

02 Workflow Modelling →