
Overview

LLM Agents represent the evolution from language models that generate text to autonomous systems that can perceive, reason, plan, and take actions in the world. By combining LLMs with tool use, memory, planning capabilities, and feedback loops, agents can accomplish complex multi-step tasks that require sustained reasoning and interaction with external systems.
This guide is part of the bonus material for Hands-On Large Language Models. It explores how the foundational models covered in the book can be transformed into autonomous agents.

What Makes an Agent?

Traditional LLMs:
User Input → LLM → Text Output
LLM Agents:
Task Goal → [Planning] → [Action Selection] → [Tool Use] → [Observation] → [Reasoning] → ...
              ↓            ↓                    ↓             ↓             ↓
           Memory      Environment        Tools/APIs      Feedback      Next Action
The key differences:
  • Autonomy: Agents decide on actions without per-step human guidance
  • Tool use: Can interact with external systems and APIs
  • Memory: Maintain context across multiple interactions
  • Planning: Break down complex goals into actionable steps
  • Iteration: Learn from feedback and adjust approach

Why Agents Matter

Agents unlock new capabilities and applications:
  • Task Automation: Complete multi-step workflows autonomously
  • Tool Integration: Seamlessly use calculators, databases, APIs, and more
  • Complex Problem Solving: Handle tasks requiring planning and iteration
  • Real-World Interaction: Interface with software, web services, and systems
  • Continuous Improvement: Learn from experience and feedback

What You’ll Learn

The visual guide explains LLM agents through detailed illustrations:

Agent Architectures

ReAct, ReWOO, AutoGPT, and other agent frameworks and design patterns

Tool Use

How agents discover, select, and execute tools to accomplish tasks

Planning Strategies

Different approaches to breaking down goals and sequencing actions

Memory Systems

Short-term and long-term memory for maintaining context and learning

Visual Guide

A Visual Guide to LLM Agents

Read the full visual guide with detailed diagrams showing how LLM agents work, from basic tool use to complex multi-agent systems.
Agent systems build on multiple concepts from the book:
  • Chapter 5: Text Generation - Core LLM capabilities used by agents
  • Chapter 6: Prompt Engineering - Prompting strategies for agent behavior
  • Chapter 7: Advanced Text Generation - Techniques for reliable agent outputs
  • Chapter 9: Deploying LLMs - Infrastructure for agent systems

Core Components

1. The Agent Loop

The fundamental pattern of agent execution:
1. Observe: Get current state and feedback
2. Think: Reason about situation and goal
3. Plan: Decide on next action
4. Act: Execute action using tools
5. Repeat: Continue until goal achieved
This loop enables:
  • Adaptive behavior based on feedback
  • Error recovery and replanning
  • Multi-step task completion
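The loop above can be sketched in a few lines of Python. Here `llm_decide` is a hypothetical stand-in for an LLM call that returns either a tool invocation or a final answer; the scripted version below exists only to make the sketch runnable:

```python
# Minimal sketch of the observe-think-plan-act loop.
# `llm_decide` stands in for an LLM call that returns either a
# tool invocation or a final answer (both shapes are assumptions).

def run_agent(goal, tools, llm_decide, max_steps=10):
    history = []  # actions and observations so far (short-term memory)
    for _ in range(max_steps):
        decision = llm_decide(goal, history)    # think + plan
        if decision["type"] == "final_answer":
            return decision["content"]
        tool = tools[decision["tool"]]          # act
        observation = tool(**decision["args"])  # observe
        history.append((decision, observation))
    return None  # give up after max_steps (a simple resource limit)

# Toy run: a single "add" tool and a scripted decision function.
def scripted_llm(goal, history):
    if not history:
        return {"type": "tool", "tool": "add", "args": {"a": 2, "b": 3}}
    return {"type": "final_answer", "content": str(history[-1][1])}

result = run_agent("what is 2+3?", {"add": lambda a, b: a + b}, scripted_llm)
print(result)  # → 5
```

The `max_steps` cap matters in practice: without it, an agent that keeps choosing unproductive actions loops forever.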

2. Tool Use (Function Calling)

Agents extend LLM capabilities through tools.
Types of Tools
  • Information retrieval: Search engines, databases, APIs
  • Computation: Calculators, code execution, data analysis
  • Communication: Email, messaging, notifications
  • Actions: File operations, API calls, system commands
Tool Selection Process
  1. Agent receives task
  2. Identifies relevant tools from available set
  3. Constructs appropriate tool call with parameters
  4. Executes tool and receives result
  5. Incorporates result into reasoning
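Steps 1–3 above depend on the agent knowing what tools exist. A common pattern is a registry of (name, description, callable) entries whose descriptions are rendered into the prompt so the LLM can pick; everything below is an illustrative sketch, not a specific framework's API:

```python
# Tool registry sketch: the description is what the LLM sees,
# the callable is what the agent executes after selection.

TOOLS = {
    "calculator": {"description": "evaluate arithmetic expressions",
                   "call": lambda expr: eval(expr)},
    "search": {"description": "look up facts on the web",
               "call": lambda q: f"results for {q!r}"},
}

def tool_manifest(tools):
    """Render the available tools for inclusion in the LLM prompt."""
    return "\n".join(f"- {name}: {spec['description']}"
                     for name, spec in tools.items())

print(tool_manifest(TOOLS))
print(TOOLS["calculator"]["call"]("2 * 21"))  # → 42
```

(The `eval` here is only acceptable in a toy; real agents sandbox computation tools.)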

3. Memory Systems

Short-term Memory
  • Current conversation context
  • Recent actions and observations
  • Active goal and sub-goals
  • Working information
Long-term Memory
  • Past experiences and outcomes
  • Learned strategies and patterns
  • User preferences and history
  • Domain knowledge
Memory Techniques
  • Vector databases for semantic retrieval
  • Summarization for context compression
  • Hierarchical organization
  • Relevance-based retrieval
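Context compression can be sketched as a sliding window over recent turns, with evicted turns folded into a running summary. The crude truncation below stands in for a real LLM summarization call:

```python
from collections import deque

class ShortTermMemory:
    """Keep the last `capacity` turns; older turns are compressed
    into a running summary (truncation here is a stand-in for
    actual LLM summarization)."""
    def __init__(self, capacity=4):
        self.turns = deque(maxlen=capacity)
        self.summary = ""

    def add(self, turn):
        if len(self.turns) == self.turns.maxlen:
            evicted = self.turns[0]            # oldest turn falls out
            self.summary = (self.summary + " " + evicted)[:200]
        self.turns.append(turn)

    def context(self):
        """What gets prepended to the next prompt."""
        return self.summary.strip(), list(self.turns)

mem = ShortTermMemory(capacity=2)
for t in ["t1", "t2", "t3"]:
    mem.add(t)
summary, recent = mem.context()
print(recent)  # → ['t2', 't3']
```

Long-term memory typically replaces the summary string with writes to a vector store queried by relevance at prompt-construction time.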

4. Planning

Approaches to Planning
Single-path Planning
  • Linear sequence of steps
  • Fast and efficient
  • Works for straightforward tasks
Tree Search
  • Explore multiple possible paths
  • Backtrack from failures
  • More robust but slower
Hierarchical Planning
  • Break into sub-goals
  • Plan at multiple levels of abstraction
  • Handle complex tasks
Reactive Planning
  • Minimal upfront planning
  • React to situations as they arise
  • More flexible but potentially inefficient
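Hierarchical planning in particular has a simple recursive shape: decompose a goal into sub-goals until each is directly executable. The decomposition table below is a toy stand-in for an LLM planning call:

```python
# Hierarchical planning sketch: recurse through sub-goals until
# steps are directly executable. `decompose` stands in for an
# LLM call; the table below is illustrative.

def plan(goal, decompose, depth=0, max_depth=3):
    subgoals = decompose(goal)
    if not subgoals or depth == max_depth:
        return [goal]  # leaf: an executable step
    steps = []
    for sub in subgoals:
        steps.extend(plan(sub, decompose, depth + 1, max_depth))
    return steps

TABLE = {"write report": ["gather data", "draft", "edit"],
         "gather data": ["query database", "summarize results"]}
print(plan("write report", lambda g: TABLE.get(g, [])))
# → ['query database', 'summarize results', 'draft', 'edit']
```

The `max_depth` guard is the hierarchical analogue of the step cap in the agent loop: it bounds how far decomposition can recurse.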

Agent Architectures

ReAct (Reasoning + Acting)

Interleaves reasoning and action:
Thought: I need to find the population of Tokyo
Action: search("Tokyo population")
Observation: 13.96 million (2021)
Thought: Now I need to compare with New York
Action: search("New York City population")
Observation: 8.34 million (2021)
Thought: Tokyo is larger. I can now answer.
Answer: Tokyo has a larger population than New York City.
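The control loop behind a trace like this parses the model's `Action:` line, runs the tool, and appends an `Observation:` before asking the model to continue. A minimal sketch, where `llm` is a hypothetical function that extends the transcript by one Thought/Action or Thought/Answer block:

```python
import re

# ReAct control-loop sketch. The transcript format (Thought/Action/
# Observation/Answer) follows the trace above; `llm` is a stand-in
# for a real model call that continues the transcript.

ACTION_RE = re.compile(r"Action:\s*(\w+)\((.*)\)")

def react(question, llm, tools, max_steps=5):
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        block = llm(transcript)
        transcript += block
        if "Answer:" in block:
            return block.split("Answer:", 1)[1].strip()
        m = ACTION_RE.search(block)
        if m:
            result = tools[m.group(1)](m.group(2).strip('"'))
            transcript += f"Observation: {result}\n"
    return None

# Scripted replies stand in for the model.
replies = iter([
    'Thought: I should search.\nAction: search("Tokyo population")\n',
    'Thought: I can answer now.\nAnswer: 13.96 million\n',
])
answer = react("Population of Tokyo?", lambda t: next(replies),
               {"search": lambda q: "13.96 million (2021)"})
print(answer)  # → 13.96 million
```

Because the reasoning lives in the transcript itself, debugging a ReAct agent is largely a matter of reading that transcript.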
Benefits
  • Explicit reasoning traces
  • Clear decision-making process
  • Easy to debug and understand

AutoGPT Pattern

Fully autonomous goal-directed agent:
  • Receives high-level goal
  • Breaks into sub-tasks automatically
  • Executes and monitors progress
  • Self-corrects and adapts
Characteristics
  • Minimal human intervention
  • Persistent execution
  • Goal-driven behavior

BabyAGI Pattern

Task management focused:
  • Creates and prioritizes task lists
  • Executes tasks sequentially
  • Creates new tasks based on results
  • Maintains task context

LangChain Agents

Framework-based approach:
  • Pre-built agent types
  • Tool ecosystem
  • Memory management
  • Easy customization
The most effective agents combine multiple patterns: ReAct for transparency, hierarchical planning for complex tasks, and memory systems for learning and adaptation.

Tool Integration

Function Calling

Modern LLMs support structured function calling:
{
  "name": "search_web",
  "parameters": {
    "query": "latest AI research papers",
    "num_results": 5
  }
}
Advantages
  • Structured and reliable
  • Type-safe parameters
  • Clear tool capabilities
  • Easy to extend
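On the agent side, a structured call like the one above gets validated and dispatched to a local function. Real APIs (OpenAI-style function calling, for instance) return a similar JSON shape; the tool table here is illustrative:

```python
import json

# Dispatch sketch for a structured function call like the JSON above.
# `search_web` and the TOOLS table are illustrative assumptions.

def search_web(query, num_results=5):
    return [f"result {i} for {query!r}" for i in range(num_results)]

TOOLS = {"search_web": search_web}

def dispatch(call_json):
    call = json.loads(call_json)
    func = TOOLS[call["name"]]          # KeyError surfaces unknown tools
    return func(**call["parameters"])   # TypeError surfaces bad params

results = dispatch('{"name": "search_web", '
                   '"parameters": {"query": "latest AI research papers", '
                   '"num_results": 2}}')
print(len(results))  # → 2
```

Letting `KeyError`/`TypeError` surface (or be caught and fed back to the model) is what makes structured calling "type-safe" in practice: malformed calls fail loudly instead of silently doing the wrong thing.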

Tool Discovery

Agents must know what tools are available:
  • Static declaration: Tools defined upfront
  • Dynamic discovery: Query tool registries
  • Learning: Discover through experience

Tool Composition

Combining multiple tools for complex tasks:
  • Sequential: Output of one tool feeds to next
  • Parallel: Execute multiple tools simultaneously
  • Conditional: Choose tools based on conditions
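The sequential and parallel patterns can be sketched directly; the tools here are toy functions standing in for real API-backed ones:

```python
from concurrent.futures import ThreadPoolExecutor

# Two composition patterns over a list of tools.

def sequential(tools, value):
    for tool in tools:          # output of one tool feeds the next
        value = tool(value)
    return value

def parallel(tools, value):
    with ThreadPoolExecutor() as pool:  # independent calls fan out
        return list(pool.map(lambda t: t(value), tools))

double = lambda x: x * 2
inc = lambda x: x + 1
print(sequential([double, inc], 3))  # → 7
print(parallel([double, inc], 3))    # → [6, 4]
```

Conditional composition is just ordinary branching around these two primitives, with the condition usually decided by the LLM.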

Advanced Capabilities

Multi-Agent Systems

Multiple agents working together:
  • Specialized agents: Each handles specific tasks
  • Coordination: Agents communicate and synchronize
  • Emergent behavior: Complex outcomes from simple agents
  • Robustness: System continues if one agent fails

Self-Improvement

Agents that learn and improve:
  • Experience replay: Learn from past interactions
  • Strategy optimization: Refine approaches based on outcomes
  • Tool learning: Discover new tool combinations
  • Preference learning: Adapt to user feedback

Human-in-the-Loop

Balancing autonomy with human oversight:
  • Approval points: Human confirms critical actions
  • Feedback integration: Learn from human corrections
  • Collaborative decision-making: Human and agent work together
  • Safety mechanisms: Humans can intervene when needed
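An approval point can be as simple as a gate in front of a sensitive-action set. `ask_human` below is a stand-in for a real review UI or confirmation prompt:

```python
# Human-in-the-loop sketch: sensitive actions pause for approval.
# SENSITIVE, the tools table, and `ask_human` are all illustrative.

SENSITIVE = {"delete_file", "send_email", "make_payment"}

def execute(action, args, tools, ask_human):
    if action in SENSITIVE and not ask_human(action, args):
        return {"status": "rejected", "action": action}
    return {"status": "ok", "result": tools[action](**args)}

tools = {"send_email": lambda to, body: f"sent to {to}"}
print(execute("send_email", {"to": "a@b.c", "body": "hi"},
              tools, ask_human=lambda a, kw: False))
# → {'status': 'rejected', 'action': 'send_email'}
```

The rejection result goes back into the agent's context, so it can replan rather than retry the blocked action.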

Practical Applications

Software Development

  • Code generation and debugging
  • Automated testing
  • Documentation writing
  • Dependency management

Research and Analysis

  • Literature review
  • Data analysis workflows
  • Report generation
  • Citation management

Business Operations

  • Customer support automation
  • Data entry and processing
  • Report generation
  • Workflow automation

Personal Assistance

  • Email management
  • Schedule coordination
  • Information gathering
  • Task automation

Implementation Best Practices

1. Start Simple

  • Begin with single-step tool use
  • Add complexity gradually
  • Test thoroughly at each stage

2. Design for Observability

  • Log all actions and reasoning
  • Make decision process transparent
  • Enable debugging and analysis

3. Implement Safety Measures

  • Validate tool outputs
  • Set resource limits
  • Require approval for sensitive actions
  • Implement rollback mechanisms

4. Handle Failures Gracefully

  • Expect tool failures
  • Implement retry logic
  • Provide fallback strategies
  • Learn from errors
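Retry logic with a fallback can be sketched as a small wrapper around any tool call; the backoff schedule here is illustrative:

```python
import time

# Graceful-failure sketch: retry with exponential backoff,
# then return a fallback instead of crashing the agent.

def with_retries(tool, args, retries=3, base_delay=0.01, fallback=None):
    for attempt in range(retries):
        try:
            return tool(**args)
        except Exception:
            if attempt == retries - 1:
                return fallback                    # last resort
            time.sleep(base_delay * 2 ** attempt)  # back off, then retry

# A tool that fails twice before succeeding.
calls = {"n": 0}
def flaky(x):
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return x * 10

print(with_retries(flaky, {"x": 4}))  # → 40
```

Catching broad `Exception` is deliberate in an agent context: any failure should become data the agent can reason about, not a crash.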

5. Optimize for Cost

  • Cache frequent queries
  • Batch operations when possible
  • Use smaller models for simple decisions
  • Monitor token usage
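Caching frequent queries is often the cheapest win and needs nothing beyond memoization for exact-repeat queries:

```python
from functools import lru_cache

# Cost-optimization sketch: memoize identical queries so the agent
# never pays for the same expensive call twice. The counter only
# exists to make the cache behavior visible.

calls = {"n": 0}

@lru_cache(maxsize=256)
def cached_search(query):
    calls["n"] += 1                  # stands in for an expensive API call
    return f"results for {query!r}"

cached_search("llm agents")
cached_search("llm agents")          # served from cache, no second call
print(calls["n"])  # → 1
```

Semantic caching (matching near-duplicate queries via embeddings) extends the same idea when exact repeats are rare.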

Challenges and Limitations

Current Challenges

  • Reliability: Agents can make mistakes or get stuck
  • Cost: Multiple LLM calls can be expensive
  • Latency: Multi-step reasoning takes time
  • Safety: Autonomous actions need careful control
  • Debugging: Complex behavior can be hard to trace

Active Research Areas

  • Improving planning efficiency
  • Better error recovery mechanisms
  • Formal verification of agent behavior
  • Multi-agent coordination
  • Learning from human feedback

Tools and Frameworks

LangChain

  • Comprehensive agent framework
  • Large tool ecosystem
  • Memory management
  • Active community

AutoGPT

  • Autonomous task execution
  • Goal-driven architecture
  • Plugin system
  • Web UI

BabyAGI

  • Task-driven autonomous agent
  • Simple and hackable
  • Focused on task management

Microsoft Semantic Kernel

  • Enterprise-focused framework
  • .NET and Python support
  • Plugin architecture
  • Integration with Microsoft services

Haystack

  • Focus on retrieval-augmented agents
  • Strong NLP pipeline
  • Production-ready
  • Flexible architecture

Additional Resources

The Future of Agents

Agent technology is rapidly evolving:
Near-term
  • More reliable tool use
  • Better planning algorithms
  • Improved error recovery
  • Multi-modal agents (vision + language)
Medium-term
  • Multi-agent collaboration
  • Learning from experience
  • Formal verification
  • Domain-specific agents
Long-term
  • General-purpose autonomous assistants
  • Self-improving systems
  • Complex real-world task completion
  • Human-AI collaborative workflows

Reasoning LLMs

The reasoning capabilities that enable agent planning

DeepSeek-R1

A powerful reasoning model for agent systems

Conclusion

LLM Agents represent a paradigm shift from static text generation to dynamic, goal-directed systems that can autonomously accomplish complex tasks. By combining language understanding with tool use, planning, and memory, agents extend the utility of LLMs far beyond their original design. As the technology matures, agents will become increasingly capable, reliable, and integrated into our workflows and applications.
