Everything we’ve covered so far—calling LLMs, building RAG systems, using frameworks—follows a pattern: human asks, AI responds. One turn, done.
Agentic AI breaks that pattern. Instead of responding once, agents can plan multi-step approaches, use tools, reflect on their progress, and autonomously work toward goals. This is the frontier of practical AI in 2025.
I’ll be honest: this space is evolving fast, and there’s a lot of hype. Let me separate what actually works from what’s still research.
Series Navigation: Part 1: GenAI Intro → Part 2: LLMs → Part 3: Frameworks → Part 4: Agentic AI (You are here) → Part 5: Building Agents → Part 6: Enterprise
What Makes an AI “Agentic”?
An AI agent has these characteristics:
- Goal-oriented: Given a goal, not just a question
- Autonomous planning: Breaks goals into steps without explicit instructions
- Tool use: Can execute actions—search, code, API calls, file operations
- Observation and adaptation: Sees results and adjusts approach
- Memory: Maintains context across multiple steps
Simple chain: “Summarize this document” → [Summary]
Agent: “Research competitor pricing and create a comparison report” → [Plans steps] → [Searches web] → [Extracts data] → [Compares] → [Generates report]
The Agent Architecture
Most agent systems follow a similar pattern:
┌────────────────────────────────────────────────────────────┐
│                         AGENT LOOP                          │
│                                                              │
│  ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐  │
│  │  THINK  │ →  │  PLAN   │ →  │   ACT   │ →  │ OBSERVE │  │
│  └─────────┘    └─────────┘    └─────────┘    └─────────┘  │
│       ↑                                            │        │
│       └────────────────────────────────────────────┘        │
│                     (Loop until done)                        │
└────────────────────────────────────────────────────────────┘
                              │
                              ↓
 ┌─────────────────────────────────────────────────────────┐
 │                          TOOLS                           │
 │  [Search] [Code Exec] [APIs] [Files] [Database] [...]    │
 └─────────────────────────────────────────────────────────┘
The ReAct Pattern: Reasoning + Acting
ReAct (Reasoning and Acting) is the foundational agent pattern. The model alternates between thinking about what to do and taking actions.
# ReAct pattern - manual implementation
from openai import OpenAI
import json

client = OpenAI()

# Define available tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search the web for current information",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Search query"}
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Perform mathematical calculations",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {"type": "string", "description": "Math expression"}
                },
                "required": ["expression"]
            }
        }
    }
]

def search_web(query: str) -> str:
    # In production, call actual search API
    return f"Search results for '{query}': [Simulated results]"

def calculate(expression: str) -> str:
    try:
        result = eval(expression)  # Use safer evaluation in production
        return str(result)
    except Exception as e:
        return f"Error: {e}"

def execute_tool(name: str, args: dict) -> str:
    """Execute a tool and return result."""
    if name == "search_web":
        return search_web(args["query"])
    elif name == "calculate":
        return calculate(args["expression"])
    return "Unknown tool"

def run_agent(goal: str, max_iterations: int = 10):
    """Run a ReAct agent loop."""
    messages = [
        {
            "role": "system",
            "content": """You are a helpful assistant that can use tools to accomplish goals.
Think step by step. Use tools when needed. State your reasoning."""
        },
        {"role": "user", "content": goal}
    ]

    for i in range(max_iterations):
        print(f"\n--- Iteration {i+1} ---")

        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools,
            tool_choice="auto"
        )
        message = response.choices[0].message

        # Check if done (no tool calls)
        if not message.tool_calls:
            print(f"Agent response: {message.content}")
            return message.content

        # Execute tool calls
        messages.append(message)
        for tool_call in message.tool_calls:
            func_name = tool_call.function.name
            func_args = json.loads(tool_call.function.arguments)
            print(f"Calling: {func_name}({func_args})")

            result = execute_tool(func_name, func_args)
            print(f"Result: {result}")

            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result
            })

    return "Max iterations reached"

# Example
run_agent("What is 15% of the current US minimum wage times 40 hours?")
Agent Architectures That Work
1. Single Agent with Tools
The simplest pattern. One LLM with access to tools.
Good for: Focused tasks—research, data analysis, coding assistants
Limitations: Can get stuck, may not handle very complex multi-domain tasks
2. Multi-Agent Systems
Multiple specialized agents that collaborate.
# Multi-agent pattern - specialist agents
class ResearchAgent:
    """Agent specialized in information gathering."""
    system_prompt = """You are a research specialist.
    Search for information and summarize findings.
    Be thorough but concise."""
    tools = [search_tool, scrape_tool]

class AnalystAgent:
    """Agent specialized in data analysis."""
    system_prompt = """You are a data analyst.
    Analyze data, find patterns, create insights.
    Be quantitative and precise."""
    tools = [calculate_tool, chart_tool]

class WriterAgent:
    """Agent specialized in content creation."""
    system_prompt = """You are a technical writer.
    Create clear, engaging content from research and analysis.
    Structure information logically."""
    tools = [format_tool]

class OrchestratorAgent:
    """Coordinates other agents."""

    def execute_task(self, goal: str):
        # 1. Research phase
        research_results = self.research_agent.run(
            f"Research: {goal}"
        )

        # 2. Analysis phase
        analysis = self.analyst_agent.run(
            f"Analyze this research: {research_results}"
        )

        # 3. Writing phase
        final_output = self.writer_agent.run(
            f"Create report from: {analysis}"
        )

        return final_output
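These classes are deliberately skeletal: each specialist only declares a system prompt and a tool list, and the orchestrator assumes each agent exposes a run() method. Here's a minimal sketch of what that shared runner could look like, assuming the client, tools, and execute_tool helpers from the ReAct example above (the run_specialist name is my own, not a library API):

# Shared runner sketch - assumes client, tools, execute_tool from the ReAct example
def run_specialist(system_prompt: str, task: str, tools: list, max_iterations: int = 5) -> str:
    """Run one specialist agent: its own system prompt, its own tool subset."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": task}
    ]
    for _ in range(max_iterations):
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools,
            tool_choice="auto"
        )
        message = response.choices[0].message
        if not message.tool_calls:
            return message.content  # specialist is done
        messages.append(message)
        for tool_call in message.tool_calls:
            result = execute_tool(tool_call.function.name, json.loads(tool_call.function.arguments))
            messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": result})
    return "Max iterations reached"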
3. Hierarchical Agents
A manager agent delegates to worker agents.
                ┌──────────────────┐
                │  Manager Agent   │
                │ (Plans & Routes) │
                └─────────┬────────┘
                          │
        ┌─────────────────┼─────────────────┐
        ↓                 ↓                 ↓
  ┌───────────┐     ┌───────────┐     ┌───────────┐
  │  Worker 1 │     │  Worker 2 │     │  Worker 3 │
  │ (Research)│     │  (Coding) │     │ (Writing) │
  └───────────┘     └───────────┘     └───────────┘
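There's no single canonical implementation of this; one minimal sketch of the manager-routes-to-workers idea reuses run_specialist from the multi-agent sketch above. The worker names and prompts here are illustrative, not a fixed scheme:

# Hierarchical sketch - manager plans, then routes each step to a worker (illustrative prompts)
WORKERS = {
    "research": "You are a research specialist. Gather and summarize information.",
    "coding":   "You are a coding specialist. Write and explain code.",
    "writing":  "You are a writing specialist. Produce clear, structured prose.",
}

def manager(goal: str) -> str:
    # Ask the manager model for a JSON plan that assigns each step to a worker
    plan_response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": 'Break the goal into steps. Return JSON: '
                                          '{"steps": [{"worker": "research|coding|writing", "task": "..."}]}'},
            {"role": "user", "content": goal}
        ],
        response_format={"type": "json_object"}
    )
    plan = json.loads(plan_response.choices[0].message.content)

    results = []
    for step in plan["steps"]:
        worker_prompt = WORKERS.get(step["worker"], WORKERS["research"])
        results.append(run_specialist(worker_prompt, step["task"], tools))
    return "\n\n".join(results)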
Planning Strategies
Plan-and-Execute
# Plan-and-Execute: Create full plan first, then execute
def plan_and_execute(goal: str):
    """First plan all steps, then execute them."""
    # Planning phase
    plan_response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": """Create a detailed plan to accomplish the goal.
Return as JSON: {"steps": ["step1", "step2", ...]}"""
            },
            {"role": "user", "content": f"Goal: {goal}"}
        ],
        response_format={"type": "json_object"}
    )
    plan = json.loads(plan_response.choices[0].message.content)
    print(f"Plan: {plan['steps']}")

    # Execution phase
    results = []
    for step in plan["steps"]:
        result = execute_step(step, results)
        results.append({"step": step, "result": result})
        print(f"Completed: {step}")

    return results
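execute_step is left undefined above. One way to fill it in, shown here as a sketch, is to hand each step plus the results so far to the run_agent loop from the ReAct section:

# One possible execute_step - run a single plan step with prior results as context (sketch)
def execute_step(step: str, prior_results: list) -> str:
    context = "\n".join(f"- {r['step']}: {r['result']}" for r in prior_results)
    prompt = f"Completed so far:\n{context}\n\nNow execute this step and report the result:\n{step}"
    return run_agent(prompt, max_iterations=5)  # reuse the ReAct loop from earlier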
ReWOO: Reasoning Without Observation
# ReWOO - Plan with placeholders, execute in batch
def rewoo_agent(goal: str):
    """
    ReWOO pattern:
    1. Create full plan with variable placeholders
    2. Execute all steps
    3. Substitute results
    """
    plan_prompt = f"""
    Goal: {goal}

    Create a plan using these tools: search, calculate, summarize
    Use #E1, #E2, etc. as placeholders for results.

    Format:
    Plan: [description]
    #E1 = tool[input]
    #E2 = tool[input using #E1 if needed]
    ...
    """

    # Get plan with placeholders
    plan = get_plan(plan_prompt)

    # Execute all tool calls
    evidence = {}
    for step in plan.steps:
        # Replace any placeholders with actual evidence
        resolved_input = resolve_placeholders(step.input, evidence)
        result = execute_tool(step.tool, resolved_input)
        evidence[step.var] = result

    # Final synthesis with all evidence
    return synthesize(goal, evidence)
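The helpers get_plan, resolve_placeholders, and synthesize are left abstract above. The key one, resolve_placeholders, is just string substitution of the #E1-style variables with earlier evidence; a minimal sketch:

# Placeholder resolution sketch - substitute #E1, #E2, ... with earlier evidence strings
def resolve_placeholders(tool_input: str, evidence: dict) -> str:
    for var, value in evidence.items():  # e.g. {"#E1": "7.25", "#E2": "..."}
        tool_input = tool_input.replace(var, str(value))
    return tool_input

Because every tool input is planned up front, the only LLM calls are the planner and the final synthesis, which is what makes ReWOO cheaper than a step-by-step ReAct loop.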
Tool Design Principles
Good tools make good agents. Here’s what I’ve learned:
Tool Design Best Practices
- Single responsibility: One tool, one job. “search_web” not “search_and_summarize”
- Clear descriptions: Models choose tools based on descriptions. Be precise.
- Structured outputs: Return structured data the model can parse
- Error handling: Return meaningful errors, not stack traces
- Idempotency: Same input → same output (where possible)
- Bounds: Limit what tools can do. Don’t give agents unlimited filesystem access.
# Well-designed tool example
def create_tool_definition():
    return {
        "type": "function",
        "function": {
            "name": "query_database",
            "description": """Execute a read-only SQL query against the analytics database.
Only SELECT queries are allowed. Returns up to 100 rows.
Tables: users, orders, products, events.""",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "SQL SELECT query"
                    },
                    "limit": {
                        "type": "integer",
                        "description": "Max rows to return (default 100, max 1000)"
                    }
                },
                "required": ["query"]
            }
        }
    }

def query_database(query: str, limit: int = 100) -> dict:
    """Execute query with safety checks."""
    # Safety checks
    query_upper = query.upper().strip()
    if not query_upper.startswith("SELECT"):
        return {"error": "Only SELECT queries allowed", "success": False}

    forbidden = ["DROP", "DELETE", "UPDATE", "INSERT", "ALTER"]
    for word in forbidden:
        if word in query_upper:
            return {"error": f"{word} not allowed", "success": False}

    # Execute with limits
    try:
        limit = min(limit, 1000)
        results = db.execute(f"{query} LIMIT {limit}")
        return {
            "success": True,
            "row_count": len(results),
            "data": results
        }
    except Exception as e:
        return {"error": str(e), "success": False}
Memory Systems
Agents need memory to work on complex tasks over time.
| Memory Type | Purpose | Implementation |
|---|---|---|
| Working Memory | Current task context | Conversation history in context window |
| Short-term | Recent interactions | Sliding window, summarization |
| Long-term | Persistent knowledge | Vector database, knowledge graph |
| Episodic | Past experiences | Stored successful task traces |
# Memory-augmented agent
class MemoryAgent:
    def __init__(self):
        self.working_memory = []        # Current conversation
        self.vector_store = ChromaDB()  # Long-term memory

    def remember(self, content: str, metadata: dict):
        """Store in long-term memory."""
        self.vector_store.add(content, metadata)

    def recall(self, query: str, k: int = 5) -> list:
        """Retrieve relevant memories."""
        return self.vector_store.search(query, k)

    def run(self, user_input: str):
        # 1. Recall relevant memories
        memories = self.recall(user_input)
        memory_context = "\n".join([m.content for m in memories])

        # 2. Build context with memories
        messages = [
            {"role": "system", "content": f"Relevant context:\n{memory_context}"},
            *self.working_memory,
            {"role": "user", "content": user_input}
        ]

        # 3. Generate response
        response = self.llm.generate(messages)

        # 4. Update memories
        self.working_memory.append({"role": "user", "content": user_input})
        self.working_memory.append({"role": "assistant", "content": response})

        # 5. Optionally persist important info
        if self.is_important(response):
            self.remember(response, {"type": "interaction", "timestamp": now()})

        return response
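The table lists a sliding window with summarization for short-term memory, which MemoryAgent doesn't implement. A minimal sketch, assuming the same llm.generate interface used in run() (the trim_working_memory name is my own):

# Sliding-window sketch - summarize older turns, keep recent ones verbatim
def trim_working_memory(working_memory: list, llm, keep_last: int = 10) -> list:
    """Return a shortened message list: one summary message plus the last keep_last turns."""
    if len(working_memory) <= keep_last:
        return working_memory
    older, recent = working_memory[:-keep_last], working_memory[-keep_last:]
    summary = llm.generate([
        {"role": "system", "content": "Summarize this conversation in a few sentences."},
        *older
    ])
    return [{"role": "system", "content": f"Earlier conversation (summary): {summary}"}, *recent]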
Current Limitations (Honest Assessment)
Where Agents Still Struggle
- Long-horizon tasks: Errors accumulate across steps. Plans of 10+ steps often fail.
- Recovery: When stuck, agents often loop rather than backtrack (see the loop-detection sketch after this list).
- Cost: Agent loops can be expensive—many LLM calls per task.
- Latency: Sequential tool use means slow overall execution.
- Reliability: Non-deterministic. Same task may succeed or fail.
- Safety: Autonomous actions require careful sandboxing.
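For the recovery problem in particular, a cheap mitigation is to watch for repeated tool calls and stop, or escalate to a human, instead of burning iterations. A minimal sketch; the is_looping name and the call-history bookkeeping are my own additions, not part of the ReAct code above:

# Loop-detection sketch - abort (or escalate) if the agent repeats the same tool call
def is_looping(call_history: list, window: int = 3) -> bool:
    """call_history: list of (tool_name, args_json) tuples in order."""
    if len(call_history) < window:
        return False
    recent = call_history[-window:]
    return all(call == recent[0] for call in recent)

Inside the ReAct loop you would append (func_name, tool_call.function.arguments) to a call_history list and break out when is_looping returns True.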
What Actually Works in Production
Based on what I’ve deployed:
- Constrained agents: Limited tool sets, focused domains, clear boundaries
- Human-in-the-loop: Agent proposes, human approves critical actions (see the approval-gate sketch after this list)
- Hybrid approaches: Deterministic workflows with agentic components
- Short chains: 3-5 steps max without human checkpoint
- Specialized agents: One agent per domain, not one agent for everything
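Human-in-the-loop is the easiest of these to retrofit. A minimal sketch that wraps the execute_tool dispatcher from the ReAct example behind an approval prompt; the tool names in REQUIRES_APPROVAL are illustrative:

# Human-in-the-loop sketch - gate sensitive actions behind explicit approval
REQUIRES_APPROVAL = {"query_database", "send_email"}  # illustrative tool names

def execute_tool_with_approval(name: str, args: dict) -> str:
    if name in REQUIRES_APPROVAL:
        answer = input(f"Agent wants to call {name} with {args}. Approve? [y/N] ")
        if answer.strip().lower() != "y":
            return "Action rejected by human reviewer."
    return execute_tool(name, args)  # falls through to the normal tool dispatcher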
Key Takeaways
- Agentic AI adds autonomy—agents plan, act, and adapt
- ReAct is the foundational pattern: reason, act, observe, repeat
- Tool design is crucial—clear, bounded, well-described tools
- Memory enables complex, multi-session tasks
- Start constrained—expand agent capabilities gradually
- Expect failures—build in human oversight for critical actions
What’s Next
In Part 5, we’ll get hands-on and build a complete AI agent—a code review assistant that can analyze PRs, suggest improvements, and explain its reasoning. Real code, real patterns.
References & Further Reading
- ReAct Paper – ReAct: Synergizing Reasoning and Acting in Language Models – arxiv.org
- AutoGPT – github.com/Significant-Gravitas/AutoGPT
- OpenAI Function Calling – platform.openai.com
- LangChain Agents – python.langchain.com
- The Landscape of Emerging AI Agent Architectures – arxiv.org
Building agents? I’d love to hear about your architecture choices. Connect on GitHub or comment below.