MeshWorld.

AI Agents Architecture Patterns: A Complete Guide for Developers

By Vishnu

Building AI agents that don’t hallucinate, get stuck in loops, or drain your API budget requires more than plugging an LLM into a chat interface. You need architectural patterns — proven templates for how agents think, plan, use tools, and remember context.

This guide covers the five essential patterns every developer should know when building production-ready AI agents.

:::note[TL;DR]

  • ReAct Pattern: Think → Act → Observe → Repeat. Best for single-step reasoning tasks.
  • Plan-and-Execute: Plan all steps first, then execute. Better for complex multi-step workflows.
  • Multi-Agent Systems: Divide work between specialized agents. Scales to enterprise complexity.
  • Tool Use with Reflection: Let agents call APIs, then verify results before proceeding.
  • Memory Systems: Short-term (context window) + Long-term (vector DB) + Entity (knowledge graph).

:::

What Is an AI Agent?

An AI agent is an LLM-powered system that can:

  1. Reason through complex problems
  2. Plan sequences of actions
  3. Use tools (APIs, databases, code execution)
  4. Observe results and adapt
  5. Remember context across interactions

Unlike simple chatbots, agents can take autonomous actions to achieve goals.

The Scenario: Your company needs to automate competitor analysis. A simple prompt won’t work — you need an agent that searches websites, extracts pricing, compares features, and generates a report. That’s where architecture patterns matter.

Pattern 1: ReAct (Reasoning + Acting)

The ReAct pattern alternates between reasoning and action. It’s the simplest effective agent architecture.

How It Works

Thought: I need to check the weather in New York
Action: weather_api(location="New York")
Observation: {"temp": 72, "condition": "sunny"}
Thought: Now I have the weather data. The user asked about outdoor activities.
Action: search_activities(weather="sunny", location="New York")
Observation: ["Central Park", "High Line", "Brooklyn Bridge Walk"]
Final Answer: Based on sunny 72°F weather, I recommend Central Park, the High Line, or a Brooklyn Bridge walk.

Code Implementation

from typing import Any, Callable, Dict
import json

class ReActAgent:
    def __init__(self, llm_client, tools: Dict[str, Callable]):
        self.llm = llm_client
        self.tools = tools
        self.max_iterations = 10
    
    def run(self, query: str) -> str:
        context = f"Query: {query}\n\n"
        
        for i in range(self.max_iterations):
            # Generate thought and action
            prompt = self._build_prompt(context)
            response = self.llm.generate(prompt)
            
            # Parse response
            thought = self._extract_thought(response)
            action = self._extract_action(response)
            
            if not action:
                # Agent provided final answer
                return response
            
            # Execute tool; feed errors back as observations so the agent can recover
            tool_name, tool_input = self._parse_action(action)
            context += f"Thought: {thought}\n"
            context += f"Action: {action}\n"
            if tool_name in self.tools:
                observation = self.tools[tool_name](**tool_input)
                context += f"Observation: {observation}\n\n"
            else:
                context += f"Observation: Error: tool '{tool_name}' not found\n\n"
        
        return "Max iterations reached"
    
    def _build_prompt(self, context: str) -> str:
        return f"""You are a helpful assistant. Use the following format:

Thought: [Your reasoning about what to do next]
Action: [Tool name and JSON input, or "Final Answer"]

Available tools:
- weather_api(location: str)
- search_activities(weather: str, location: str)
- calculator(expression: str)

{context}
"""

When to Use ReAct

| Use Case | Example |
| --- | --- |
| Single-step decisions | "Should I bring an umbrella?" |
| Sequential tool calls | Search → Filter → Summarize |
| Interactive debugging | Fix code errors step by step |
| Customer support | Diagnose issues through questioning |

Limitations

  • No backtracking: Can’t revise earlier decisions
  • Short horizon: Struggles with 10+ step tasks
  • No parallel execution: Steps happen sequentially

Pattern 2: Plan-and-Execute

For complex tasks, plan everything first, then execute. This avoids mid-task dead ends.

How It Works

Plan:
1. Search for competitor pricing data
2. Extract pricing from top 3 results
3. Compare with our pricing
4. Generate analysis report

Execution:
[Execute step 1] → [Execute step 2] → [Execute step 3] → [Execute step 4]

Code Implementation

from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

@dataclass
class Step:
    description: str
    tool: Optional[str]
    input_params: Dict[str, Any]
    output_var: str

class PlanAndExecuteAgent:
    def __init__(self, llm_client, tools: Dict[str, Callable]):
        self.llm = llm_client
        self.tools = tools
    
    def run(self, query: str) -> str:
        # Phase 1: Planning
        plan = self._create_plan(query)
        
        # Phase 2: Execution
        results = {}
        for step in plan:
            if step.tool:
                # Substitute variables from previous steps
                params = self._substitute_vars(step.input_params, results)
                results[step.output_var] = self.tools[step.tool](**params)
            else:
                # LLM reasoning step
                results[step.output_var] = self._llm_reason(step.description, results)
        
        return results.get('final_output', 'Task completed')
    
    def _create_plan(self, query: str) -> List[Step]:
        prompt = f"""Create a step-by-step plan for: {query}

Format each step as:
- Step: [description]
- Tool: [tool_name or "none"]
- Input: [params as JSON]
- Output: [variable_name]

Available tools: {list(self.tools.keys())}
"""
        response = self.llm.generate(prompt)
        return self._parse_plan(response)
    
    def _substitute_vars(self, params: Dict, results: Dict) -> Dict:
        """Replace {{var}} with actual values from previous steps"""
        resolved = {}
        for key, value in params.items():
            if isinstance(value, str) and value.startswith('{{') and value.endswith('}}'):
                var_name = value[2:-2]
                resolved[key] = results.get(var_name, value)
            else:
                resolved[key] = value
        return resolved
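The `_parse_plan` helper is referenced above but never shown. One possible standalone sketch, assuming the planner sticks to the `- Step / - Tool / - Input / - Output` format requested in the prompt (the regex and the `parse_plan` name are illustrative, not a fixed API):

```python
import json
import re
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

@dataclass
class Step:
    description: str
    tool: Optional[str]
    input_params: Dict[str, Any]
    output_var: str

def parse_plan(response: str) -> List[Step]:
    """Parse the planner's '- Step / - Tool / - Input / - Output' blocks into Step objects."""
    steps: List[Step] = []
    # Each step is four '- Key: value' lines; tolerate extra whitespace
    pattern = re.compile(
        r"- Step:\s*(?P<desc>.+?)\s*\n"
        r"- Tool:\s*(?P<tool>.+?)\s*\n"
        r"- Input:\s*(?P<input>.+?)\s*\n"
        r"- Output:\s*(?P<output>.+?)\s*(?:\n|$)"
    )
    for m in pattern.finditer(response):
        tool = m.group("tool").strip()
        steps.append(Step(
            description=m.group("desc").strip(),
            tool=None if tool.lower() == "none" else tool,
            input_params=json.loads(m.group("input")),
            output_var=m.group("output").strip(),
        ))
    return steps
```

In practice LLM output drifts from any requested format, so production parsers usually validate against a schema and re-prompt on failure rather than trusting one regex pass.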

When to Use Plan-and-Execute

| Use Case | Example |
| --- | --- |
| Multi-step workflows | Research → Draft → Review → Publish |
| Data pipelines | Extract → Transform → Load → Validate |
| Report generation | Gather data → Analyze → Visualize → Write |
| Code generation | Plan architecture → Generate files → Test |

Advantages Over ReAct

  • Global optimization: Plans consider all steps upfront
  • Parallel execution: Independent steps run simultaneously
  • Better error recovery: Can replan from any failure point
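The parallel-execution advantage can be sketched with `asyncio.gather`: group steps into dependency "waves" and run each wave concurrently. This assumes each step carries a `depends_on` list, which the `Step` dataclass above doesn't include — a hypothetical extension for illustration:

```python
import asyncio
from typing import Dict

async def run_parallel(steps: Dict[str, dict], run_step) -> Dict[str, object]:
    """Execute plan steps in dependency waves; independent steps run concurrently.

    `steps` maps step name -> {"depends_on": [names]}; `run_step` is an async
    callable taking a step name. (Hypothetical shape, not the article's Step.)
    """
    done: Dict[str, object] = {}
    remaining = dict(steps)
    while remaining:
        # A step is ready once all its dependencies have finished
        ready = [n for n, s in remaining.items()
                 if all(d in done for d in s.get("depends_on", []))]
        if not ready:
            raise RuntimeError("Dependency cycle in plan")
        results = await asyncio.gather(*(run_step(n) for n in ready))
        for name, result in zip(ready, results):
            done[name] = result
            del remaining[name]
    return done
```

With a plan like `{"a": [], "b": ["a"], "c": ["a"]}`, steps `b` and `c` run in the same wave once `a` finishes.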

Pattern 3: Multi-Agent Systems

Divide complex tasks between specialized agents. Each agent has a specific role and expertise.

Architecture Overview

┌─────────────────────────────────────────┐
│         Orchestrator Agent              │
│    (Routes tasks, manages workflow)     │
└────────────────────┬────────────────────┘
                     │
     ┌────────────┬──┴─────────┬────────────┐
     ▼            ▼            ▼            ▼
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Research │ │  Writer  │ │  Coder   │ │  Review  │
│  Agent   │ │  Agent   │ │  Agent   │ │  Agent   │
└──────────┘ └──────────┘ └──────────┘ └──────────┘

Code Implementation

from typing import Any, Dict, List
import asyncio
import json

class Agent:
    def __init__(self, name: str, system_prompt: str, tools: List[str], llm_client=None):
        self.name = name
        self.system_prompt = system_prompt
        self.tools = tools
        self.llm = llm_client
    
    async def execute(self, task: str, context: Dict) -> str:
        # Each agent uses ReAct or Plan-and-Execute internally
        prompt = f"{self.system_prompt}\n\nTask: {task}\nContext: {context}"
        return await self.llm.generate(prompt)

class MultiAgentSystem:
    def __init__(self, llm_client):
        self.llm = llm_client
        self.agents: Dict[str, Agent] = {}
    
    def register_agent(self, agent: Agent):
        self.agents[agent.name] = agent
    
    async def run(self, query: str) -> str:
        # Orchestrator decides which agents to call
        plan = await self._orchestrate(query)
        
        results = {}
        for step in plan:
            agent_name = step['agent']
            task = step['task']
            
            if agent_name in self.agents:
                agent = self.agents[agent_name]
                results[agent_name] = await agent.execute(task, results)
        
        # Synthesize final output
        return await self._synthesize(query, results)
    
    async def _orchestrate(self, query: str) -> List[Dict]:
        """Determine which agents to use and in what order"""
        orchestrator_prompt = f"""Given this task: {query}

Available agents:
{[f"- {name}: {agent.system_prompt[:100]}..." for name, agent in self.agents.items()]}

Create an execution plan:
1. Which agents to use
2. What task to give each
3. Dependencies between agents

Output as JSON list with 'agent', 'task', and 'depends_on' keys.
"""
        response = await self.llm.generate(orchestrator_prompt)
        return json.loads(response)

# Usage (llm_client is your LLM client wrapper)
system = MultiAgentSystem(llm_client)

system.register_agent(Agent(
    name="researcher",
    system_prompt="You are a research specialist. Find accurate, up-to-date information.",
    tools=["web_search", "academic_search", "news_api"]
))

system.register_agent(Agent(
    name="writer",
    system_prompt="You are a technical writer. Create clear, engaging content.",
    tools=["grammar_check", "readability_score"]
))

system.register_agent(Agent(
    name="coder",
    system_prompt="You are a senior developer. Write clean, tested code.",
    tools=["code_executor", "linter", "test_runner"]
))

# From inside an async function (or wrapped in asyncio.run):
result = await system.run("Create a Python script that fetches weather data and sends email alerts")

Agent Specialization Examples

| Agent Type | Responsibility | Tools |
| --- | --- | --- |
| Research Agent | Information gathering | Search APIs, databases, web scraping |
| Analysis Agent | Data processing | Pandas, SQL, visualization |
| Code Agent | Implementation | Code execution, linters, tests |
| Review Agent | Quality assurance | Fact-checking, style guides |
| UI Agent | Interface design | Component libraries, design systems |

When to Use Multi-Agent

| Use Case | Why Multiple Agents? |
| --- | --- |
| Content platform | Research → Write → Edit → SEO optimize |
| DevOps automation | Monitor → Analyze → Plan → Execute |
| Customer support | Triage → Resolve → Escalate → Follow-up |
| Research assistant | Literature review → Analysis → Synthesis |

Pattern 4: Tool Use with Reflection

Don’t just call tools — verify the results before proceeding.

The Reflection Loop

Plan → Act → Observe → Reflect → [Retry if needed] → Continue

Code Implementation

@dataclass
class ToolResult:
    success: bool
    data: Any
    error: Optional[str] = None

@dataclass
class Reflection:
    is_valid: bool
    issues: List[str]
    corrected_params: Optional[Dict] = None

    @classmethod
    def parse(cls, response: str) -> 'Reflection':
        data = json.loads(response)
        return cls(
            is_valid=data.get('is_valid', False),
            issues=data.get('issues', []),
            corrected_params=data.get('corrected_params'),
        )

class ReflectiveToolAgent:
    def __init__(self, llm_client, tools: Dict[str, Callable]):
        self.llm = llm_client
        self.tools = tools
        self.max_retries = 3
    
    async def use_tool(self, tool_name: str, params: Dict) -> ToolResult:
        for attempt in range(self.max_retries):
            # Execute tool
            try:
                raw_result = await self.tools[tool_name](**params)
                
                # Reflect on result
                reflection = await self._reflect(tool_name, params, raw_result)
                
                if reflection.is_valid:
                    return ToolResult(success=True, data=raw_result)
                
                # Retry with corrections
                params = reflection.corrected_params
                
            except Exception as e:
                if attempt == self.max_retries - 1:
                    return ToolResult(success=False, error=str(e))
        
        return ToolResult(success=False, error="Max retries exceeded")
    
    async def _reflect(self, tool_name: str, params: Dict, result: Any) -> 'Reflection':
        """Analyze if tool output is valid and useful"""
        prompt = f"""Tool: {tool_name}
Input: {json.dumps(params)}
Output: {json.dumps(result)}

Evaluate:
1. Is the output valid and well-formed?
2. Does it contain the expected data?
3. Are there any errors or anomalies?

Respond with JSON:
{{
    "is_valid": true/false,
    "issues": ["list of problems if any"],
    "corrected_params": {{"param": "value"}} // if retry needed
}}
"""
        response = await self.llm.generate(prompt)
        return Reflection.parse(response)

Reflection Checks

| Check | Example |
| --- | --- |
| Format validation | JSON parsing, schema validation |
| Semantic validation | "Does this answer the user's question?" |
| Error detection | Empty results, rate limits, timeouts |
| Quality assessment | "Is this search result relevant?" |
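The format-validation row is the cheapest check, since it needs no LLM call. A minimal sketch (the `check_format` function and `required_keys` parameter are illustrative names, not part of the agent above):

```python
import json
from typing import List, Tuple

def check_format(raw: str, required_keys: List[str]) -> Tuple[bool, List[str]]:
    """Format validation: parse the tool output as JSON and verify expected keys exist."""
    issues: List[str] = []
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        return False, [f"invalid JSON: {e}"]
    if not isinstance(data, dict):
        return False, ["expected a JSON object"]
    for key in required_keys:
        if key not in data:
            issues.append(f"missing key: {key}")
    return (len(issues) == 0), issues
```

Running this before the LLM-based `_reflect` call filters out malformed outputs without spending tokens; only semantically questionable results need the full reflection prompt.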

Pattern 5: Memory Systems

Agents need to remember context across sessions and learn from past interactions.

Three Types of Memory

┌─────────────────────────────────────────────────────┐
│                   WORKING MEMORY                    │
│           (Current conversation context)            │
│              ~128K tokens (Claude/GPT)              │
└─────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────┐
│                  SHORT-TERM MEMORY                  │
│       (Recent conversations, session history)       │
│        Vector DB: Pinecone, Chroma, Weaviate        │
└─────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────┐
│                  LONG-TERM MEMORY                   │
│   (User preferences, learned facts, entity graph)   │
│          Knowledge Graph + Document Store           │
└─────────────────────────────────────────────────────┘

Implementation

from typing import Any, Dict, List
import hashlib
import time

class AgentMemory:
    def __init__(self, vector_store, knowledge_graph):
        self.vector_store = vector_store
        self.kg = knowledge_graph
        self.session_id = None
    
    def start_session(self, user_id: str):
        self.session_id = hashlib.md5(f"{user_id}:{time.time()}".encode()).hexdigest()
    
    async def remember(self, content: str, memory_type: str = "short_term"):
        """Store information for later retrieval"""
        if memory_type == "short_term":
            # Vector embedding for semantic search
            embedding = await self.embed(content)
            await self.vector_store.upsert(
                ids=[f"{self.session_id}:{time.time()}"],
                embeddings=[embedding],
                metadatas=[{"content": content, "session": self.session_id}]
            )
        
        elif memory_type == "long_term":
            # Extract entities and relationships
            entities = await self._extract_entities(content)
            for entity in entities:
                await self.kg.add_entity(entity)
    
    async def recall(self, query: str, k: int = 5) -> List[str]:
        """Retrieve relevant past information"""
        # Semantic search
        query_embedding = await self.embed(query)
        results = await self.vector_store.query(
            query_embeddings=[query_embedding],
            n_results=k,
            filter={"session": self.session_id}
        )
        
        return [r['content'] for r in results['metadatas'][0]]
    
    async def _extract_entities(self, text: str) -> List[Dict]:
        """Use LLM to extract entities and relationships"""
        prompt = f"""Extract entities and relationships from:
{text}

Format: JSON list of {{"entity": "name", "type": "person/place/thing", "relationships": [{{"to": "other", "type": "works_with/located_in/etc"}}]}}"""
        response = await self.llm.generate(prompt)
        return json.loads(response)

Memory Retrieval Strategies

| Strategy | Use Case |
| --- | --- |
| Semantic search | "Find similar past conversations" |
| Entity lookup | "What's the user's company?" |
| Temporal recall | "What did we discuss last week?" |
| Structured query | "List all API integrations mentioned" |
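Temporal recall reduces to a timestamp filter over stored memories. A sketch, assuming each memory dict carries a `ts` (epoch seconds) field — an extra piece of metadata the `remember()` method above would need to store alongside `content` and `session`:

```python
import time
from typing import Dict, List

def temporal_recall(memories: List[Dict], days: float) -> List[Dict]:
    """Keep only memories stored within the last `days` days.

    Assumes each memory dict has a 'ts' epoch-seconds field (an assumption;
    the AgentMemory.remember() sketch does not store one by default).
    """
    cutoff = time.time() - days * 86400
    return [m for m in memories if m.get("ts", 0) >= cutoff]
```

Most vector stores support this natively as a metadata filter on the query (e.g. a range condition on `ts`), which avoids pulling every memory into application code first.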

Choosing the Right Pattern

| Task Complexity | Recommended Pattern |
| --- | --- |
| Simple Q&A with tool use | ReAct |
| Multi-step workflow | Plan-and-Execute |
| Cross-functional automation | Multi-Agent |
| Critical operations (finance, health) | Tool Use + Reflection |
| Persistent user relationships | Any pattern + Memory |

Common Pitfalls

The Infinite Loop

# BAD: No iteration limit
while not task_complete:
    agent.step()

# GOOD: Bounded execution
for i in range(max_iterations):
    if task_complete:
        break
    agent.step()

The Context Explosion

# BAD: Unlimited context growth
context += f"Step {i}: {result}\n"

# GOOD: Summarize old context
if len(context) > 100000:
    context = await agent.summarize(context)
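When an LLM summarizer isn't available (or is too expensive mid-loop), a sliding window that keeps the original query plus the most recent steps is a workable fallback. A sketch — `trim_context` and its parameters are illustrative, not a library API:

```python
def trim_context(context: str, max_chars: int = 100_000, keep_header_lines: int = 2) -> str:
    """Sliding-window fallback: keep the query header plus the most recent lines."""
    if len(context) <= max_chars:
        return context
    lines = context.splitlines(keepends=True)
    header = lines[:keep_header_lines]
    tail = lines[keep_header_lines:]
    # Drop oldest steps until header + tail fits the budget
    while tail and len("".join(header + tail)) > max_chars:
        tail.pop(0)
    return "".join(header + ["[...earlier steps trimmed...]\n"] + tail)
```

The trade-off versus summarization: trimming is free and deterministic but loses information from the dropped steps entirely, so it suits loops where only recent observations matter.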

The Tool Overload

# BAD: 50 tools confuses the agent
tools = [tool1, tool2, ..., tool50]

# GOOD: Group tools by function
research_tools = [search, scrape, summarize]
code_tools = [execute, lint, test]

Production Checklist

  • Set maximum iteration limits
  • Implement timeout handling
  • Add cost tracking per request
  • Log all tool calls for debugging
  • Cache frequent tool results
  • Implement graceful degradation
  • Add human-in-the-loop for critical decisions
  • Monitor hallucination rates
  • A/B test different prompts
  • Version control your agent configurations
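The timeout item from the checklist above can be handled with a thin wrapper around `asyncio.wait_for`, so every tool or LLM call is bounded. A minimal sketch (the wrapper name and fallback string are illustrative):

```python
import asyncio

async def call_with_timeout(coro, seconds: float = 30.0, fallback: str = "Tool call timed out"):
    """Bound any awaitable with a timeout; return a fallback instead of hanging the agent."""
    try:
        return await asyncio.wait_for(coro, timeout=seconds)
    except asyncio.TimeoutError:
        return fallback
```

Returning a fallback string (rather than raising) lets the agent treat the timeout as an observation and decide whether to retry, replan, or escalate to a human.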

Summary

  • ReAct: Simple, effective for single-step reasoning. Think → Act → Observe.
  • Plan-and-Execute: Complex workflows need upfront planning.
  • Multi-Agent: Scale to enterprise by specializing agents.
  • Tool Use + Reflection: Verify results, don’t blindly trust.
  • Memory: Context across sessions separates toys from tools.

The best agents combine these patterns. Start with ReAct, add planning for complexity, specialize into multi-agent for scale, and always verify critical operations.