What Makes an AI Agent an Agent?

What Is an AI Agent?

An LLM (Large Language Model) — the reasoning engine is the brain. An Agent — a system that perceives, decides, and acts is the whole body. It uses the LLM to understand, reason, and decide — but also acts on those decisions.

The Four Pillars

Perceive
its environment

→

Reason
about observations

→

Plan
to achieve goals

→

Act
execute the plan

These pillars describe what an agent does at runtime. The five components below describe how you build one — Persona and Knowledge shape Perceive/Reason, Prompting drives Plan, Tools enable Act, and Interaction connects the agent to its environment.

Automation vs Agentic

❌ Deterministic Workflow

Fixed input → fixed output, rigid rules, no decision-making.

Efficient but brittle.

→

Welcome email

→

Wait 3d

→

Follow-up

Robotic Process Automation (RPA), drip campaigns, cron jobs — same category.

✅ Agentic Workflow

A dynamic process — adapts, decomposes tasks, routes intelligently.

The agent decides what to do, not just how.

Adapts to new information
Decides its own steps
Handles novel situations

The Core Shift

Using an LLM doesn't make something an agent. What makes it agentic is the move from fixed script to goal-oriented process.

One nuance: this framing assumes you're starting from a deterministic process — something with a clear, repeatable structure (invoice processing, customer triage, report generation). Some tasks — writing code, research, open-ended planning — were never deterministic to begin with. Those are inherently agentic from the start; there's no fixed-script version to automate first. The mistake to avoid in both directions: don't add an agent to something that's genuinely deterministic (unnecessary complexity), and don't build rigid automation for something that's inherently open-ended (it'll break constantly).

Components of an Agent

Deep Dive: Persona

The system prompt that narrows a general-purpose LLM into a focused identity. Same LLM, different persona = completely different agent.

Controls: role, tone, boundaries, output format

Formal Analyst

"You are a senior financial analyst. Use formal language. Always cite data sources. Output in structured tables."

User: "How's AAPL doing?"

Agent: [illustrative] "AAPL is trading at $XXX (+X.X%). Q3 revenue exceeded estimates by ~5%. See table below..."

Friendly Support Bot

"You are a friendly customer support assistant. Use casual language. Keep answers short and helpful. Add emojis."

User: "How's AAPL doing?"

Agent: [illustrative] "Apple's doing great today! Stock is up a bit — not bad at all! Anything else I can help with? 🍎"

Same LLM, same question — completely different responses. That's the power of persona.

Deep Dive: Knowledge

Not just training data — it's everything the agent can access. Four layers, from broad to specific:

LLM Training — broad, static, general
Fine-tuning — domain-specific training
Tools — live data via APIs, search, databases
Memory — short-term (this chat) and long-term (past sessions)

Example: Same Agent, Different Knowledge Layers

User asks: "What's the return policy?"

LLM Training Only
"Most retailers offer 30-day return policies..." (generic, possibly wrong)

+ Tools (Retrieval-Augmented Generation)
"Per our policy doc: 14-day returns, receipt required, electronics are final sale."

Adding the right knowledge layer transforms a guess into a fact.

Deep Dive: Prompting Strategy

The blueprint for how the agent talks to its LLM. It's not just the user's question — it's everything around it.

Prompting Techniques

Zero-shot
Just instructions, no examples

Few-shot
Include examples to guide style

Chain-of-thought
"Think step by step"

ReAct
Reason + Act in loops

Example: Same Question, Four Techniques

Question: "Is 17 a prime number?"

Zero-shot
"Is 17 prime?"
→ "Yes" (no guidance given)

Few-shot
"7→prime, 9→not prime. Is 17 prime?"
→ "Prime" (follows the pattern)

Chain-of-thought
"Think step by step."
→ "17 ÷ 2 = 8.5, ÷ 3 = 5.67... no divisors. Prime."

ReAct
"Reason, then verify with a tool."
→ Thought: "Check divisibility." Action: is_prime(17) → True

Each technique adds more structure — use the simplest one that gets reliable results.

Deep Dive: Execution / Tools

The agent's hands and feet. Tools — external capabilities: APIs, code runners, databases — are what let an agent act beyond generating text.

Example: With vs Without Tools

User asks: "What's the weather in Tokyo right now?"

No Tools (LLM alone)
"Tokyo typically has mild weather in June..." (guessing from training data — could be completely wrong today)

With Weather API Tool
Agent calls get_weather("Tokyo") → "Currently 24°C, partly cloudy, 65% humidity." (real-time, accurate)

Tools bridge the gap between what the LLM knows and what's happening now.

Deep Dive: Interaction

How does the outside world talk to the agent, and how does the agent talk back? Three distinct channels. Agent-to-agent communication uses plain APIs or protocols like Agent2Agent (A2A), while Model Context Protocol (MCP) is what connects an agent to its tools and data sources.

Example: Three Interaction Modes for the Same Agent

A "Meeting Scheduler" agent — same logic, three different interaction surfaces:

Human → Agent
User types in Slack: "Schedule a standup for Monday 9 AM" → Agent books it on Google Calendar

Agent → Human
Agent sends email: "Conflict detected — your 3 PM overlaps with Design Review. Reschedule?"

Agent → Agent
Scheduler agent asks Availability agent via API: "Is room B2 free Monday 9–10?" → Gets JSON response

Interaction design determines who/what can trigger the agent and how results are delivered.

Components in a Workflow

Each component isn't just a feature — it's designed to make the agent function as a step in a larger pipeline.

The Workflow Lens

Persona scopes the task. Knowledge gives each step its own context to draw on. Prompting weaves in prior outputs. Tools take real actions. Interaction formats results so the next step can consume them.

The Agent Spectrum

Not all agents are equal. They sit on a spectrum of Large Language Model (LLM) interaction sophistication:

Direct

LLM role: Responder

Context: Static, within prompt

Fit: Single-step tasks — answer a question, generate text from one instruction

Augmented

LLM role: Guided responder

Context: Pre-augmented (Retrieval-Augmented Generation)

Fit: Specialised steps where domain data shapes the output

Dynamic Context

LLM role: Adaptive processor

Context: Live via memory + tools

Fit: Multi-step loops — fetch, process, summarise in one flow

Autonomous

LLM role: Planner + executor

Context: Dynamic, memory-driven

Fit: End-to-end orchestration — goal in, sub-tasks planned, self-corrects on errors

Workflow Patterns (Preview)

Single agents are building blocks. Real power comes from how you wire them together.

Key Takeaways

An agent = LLM + Perceive + Reason + Plan + Act. The LLM is necessary but not sufficient.
The five components (Persona, Knowledge, Prompting, Tools, Interaction) determine what the agent is and what it can do.
Agentic ≠ just using an LLM. The shift is from fixed scripts to goal-oriented processes.
Choose the spectrum level that matches your task — not every problem needs an autonomous agent.
Workflow patterns (chaining, routing, parallelisation, evaluator, orchestrator) are how you compose agents into systems.