Introduction
Multi-agent systems represent the future of AI automation, where specialized agents collaborate to solve complex problems that single agents cannot handle alone. While frameworks like LangChain enable basic agent creation, LangGraph takes multi-agent orchestration to the next level with its graph-based approach to workflow management.
In this comprehensive guide, you’ll learn how to build production-ready multi-agent systems using LangGraph. We’ll cover everything from basic concepts to advanced patterns, with practical code examples you can use in your projects.

What you’ll learn:
– LangGraph fundamentals and architecture
– Building your first multi-agent system
– Advanced coordination patterns
– State management across agents
– Real-world implementation strategies
– Best practices and common pitfalls
What is LangGraph?
LangGraph is a library built on top of LangChain that enables you to create stateful, multi-agent workflows using a graph-based architecture. Think of it as a way to orchestrate multiple AI agents where each node in the graph represents an agent or a step, and edges define how information flows between them.
Why LangGraph for Multi-Agent Systems?
Traditional approach problems:
– Linear, sequential agent execution
– Difficult to implement conditional logic
– Hard to manage state across multiple agents
– No built-in support for cyclic workflows
– Complex error handling and recovery
LangGraph solutions:
– Graph-based architecture: Model complex workflows visually
– Stateful execution: Maintain context across agent interactions
– Conditional routing: Dynamic decision-making between agents
– Cyclic workflows: Support for iterative refinement
– Built-in persistence: Save and resume workflows
– Human-in-the-loop: Easy integration of human oversight
LangGraph Architecture Fundamentals
Core Concepts
1. State
The shared data structure that flows through your graph. All agents read from and write to this state.

```python
from typing import TypedDict, List

class AgentState(TypedDict):
    messages: List[str]
    current_task: str
    results: dict
    next_agent: str
```
2. Nodes
Functions or agents that perform specific tasks. Each node receives the current state and returns an updated state.
```python
def research_agent(state: AgentState) -> AgentState:
    """Research agent that gathers information"""
    # Perform research
    research_results = perform_research(state['current_task'])
    # Update state
    state['results']['research'] = research_results
    state['messages'].append(f"Research completed: {research_results}")
    return state
```
3. Edges
Connections between nodes that determine how the workflow proceeds. Edges can be:
– Normal edges: Always proceed to next node
– Conditional edges: Route based on state
– Entry point: Where the graph starts
– End point: Where the graph terminates
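To make these edge kinds concrete before touching LangGraph itself, here is a minimal, framework-free sketch of the same routing semantics in plain Python. The `run` helper and the node/edge tables are hypothetical, purely for illustration:

```python
from typing import Callable, Dict, Union

# Each node is a function state -> state; each edge is either a fixed
# successor name (normal edge) or a router function state -> successor name
# (conditional edge).
Node = Callable[[dict], dict]
Edge = Union[str, Callable[[dict], str]]

def run(nodes: Dict[str, Node], edges: Dict[str, Edge], entry: str, state: dict) -> dict:
    current = entry                  # entry point: where the graph starts
    while current != "END":          # end point: where the graph terminates
        state = nodes[current](state)
        edge = edges[current]
        # Conditional edges route based on state; normal edges are fixed
        current = edge(state) if callable(edge) else edge
    return state

nodes = {
    "work": lambda s: {**s, "count": s["count"] + 1},
    "check": lambda s: s,
}
edges = {
    "work": "check",                                         # normal edge
    "check": lambda s: "work" if s["count"] < 3 else "END",  # conditional edge
}

result = run(nodes, edges, "work", {"count": 0})
print(result)  # {'count': 3}
```

The loop in `check` -> `work` -> `check` is exactly the kind of cyclic workflow LangGraph supports natively.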
4. Graph
The overall structure that connects nodes and edges.
```python
from langgraph.graph import StateGraph, END

# Create graph
workflow = StateGraph(AgentState)

# Add nodes
workflow.add_node("researcher", research_agent)
workflow.add_node("writer", writer_agent)
workflow.add_node("reviewer", reviewer_agent)

# Add edges
workflow.add_edge("researcher", "writer")
workflow.add_edge("writer", "reviewer")
workflow.add_edge("reviewer", END)

# Set entry point
workflow.set_entry_point("researcher")

# Compile
app = workflow.compile()
```
Setting Up LangGraph
Installation
```bash
# Install LangGraph and dependencies
pip install langgraph langchain langchain-openai

# For other LLM providers
pip install langchain-anthropic     # For Claude
pip install langchain-google-genai  # For Gemini
```
Basic Configuration
```python
import os
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic

# Set API keys
os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-key"

# Initialize LLM
llm = ChatOpenAI(model="gpt-4", temperature=0.7)

# Or use Claude instead:
# llm = ChatAnthropic(model="claude-3-sonnet-20240229")
```
Building Your First Multi-Agent System
Let’s build a simple content creation system with three specialized agents:
Example: Blog Post Creation Team
Agents:
1. Researcher: Gathers information on the topic
2. Writer: Creates the blog post content
3. Editor: Reviews and improves the content

Step 1: Define the State
```python
from typing import TypedDict, List, Annotated
from langgraph.graph import StateGraph, END
from langchain_core.messages import HumanMessage, AIMessage

class BlogState(TypedDict):
    topic: str
    research_notes: str
    draft_content: str
    final_content: str
    messages: Annotated[List[HumanMessage | AIMessage], "The messages in the conversation"]
    revision_count: int
```
Step 2: Create Agent Nodes
```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(model="gpt-4", temperature=0.7)

def researcher_agent(state: BlogState) -> BlogState:
    """Research agent that gathers information"""
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a research assistant. Gather key information about the topic."),
        ("human", "Research the following topic and provide key points: {topic}")
    ])
    chain = prompt | llm
    response = chain.invoke({"topic": state["topic"]})
    state["research_notes"] = response.content
    state["messages"].append(AIMessage(content=f"Research completed: {response.content[:100]}..."))
    return state

def writer_agent(state: BlogState) -> BlogState:
    """Writer agent that creates content"""
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a professional blog writer. Create engaging content based on research."),
        ("human", "Topic: {topic}\n\nResearch Notes:\n{research_notes}\n\nWrite a comprehensive blog post.")
    ])
    chain = prompt | llm
    response = chain.invoke({
        "topic": state["topic"],
        "research_notes": state["research_notes"]
    })
    state["draft_content"] = response.content
    state["messages"].append(AIMessage(content="Draft created"))
    return state

def editor_agent(state: BlogState) -> BlogState:
    """Editor agent that reviews and improves content"""
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are an editor. Review and improve the content for clarity and engagement."),
        ("human", "Review and edit this draft:\n\n{draft}")
    ])
    chain = prompt | llm
    response = chain.invoke({"draft": state["draft_content"]})
    state["final_content"] = response.content
    state["messages"].append(AIMessage(content="Editing completed"))
    state["revision_count"] = state.get("revision_count", 0) + 1
    return state
```
Step 3: Build the Graph
```python
# Create the graph
workflow = StateGraph(BlogState)

# Add nodes
workflow.add_node("researcher", researcher_agent)
workflow.add_node("writer", writer_agent)
workflow.add_node("editor", editor_agent)

# Define the workflow
workflow.add_edge("researcher", "writer")
workflow.add_edge("writer", "editor")
workflow.add_edge("editor", END)

# Set entry point
workflow.set_entry_point("researcher")

# Compile the graph
app = workflow.compile()
```
Step 4: Run the Multi-Agent System
```python
# Initialize state
initial_state = {
    "topic": "The Future of AI Agents in 2025",
    "research_notes": "",
    "draft_content": "",
    "final_content": "",
    "messages": [],
    "revision_count": 0
}

# Execute the workflow
result = app.invoke(initial_state)

# Access results
print("Final Content:")
print(result["final_content"])
print("\nMessages:")
for msg in result["messages"]:
    print(f"- {msg.content}")
```
Advanced Multi-Agent Patterns
Pattern 1: Conditional Routing
Route to different agents based on state conditions:

```python
def should_revise(state: BlogState) -> str:
    """Decide if content needs revision"""
    if state["revision_count"] < 2:
        # Check content quality (simplified)
        if len(state["final_content"]) < 1000:
            return "writer"  # Needs more content
    return "end"

# Add conditional edge
workflow.add_conditional_edges(
    "editor",
    should_revise,
    {
        "writer": "writer",  # Go back to writer
        "end": END           # Finish
    }
)
```
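Because the routing function is plain Python, you can exercise the revise loop on its own before wiring it into a graph. The sketch below simulates the editor/writer cycle with plain dicts and a hard-coded "writer" step in place of real LLM calls; the `simulate` helper is hypothetical:

```python
def should_revise(state: dict) -> str:
    """Same decision logic as the conditional edge above."""
    if state["revision_count"] < 2:
        if len(state["final_content"]) < 1000:
            return "writer"
    return "end"

def simulate(first_draft: str) -> dict:
    """Run the editor/writer loop until should_revise says 'end'."""
    state = {"final_content": first_draft, "revision_count": 1}
    path = ["editor"]
    while should_revise(state) == "writer":
        # Pretend the writer expands the draft and the editor re-reviews it
        state["final_content"] += " more detail" * 50
        state["revision_count"] += 1
        path += ["writer", "editor"]
    return {"path": path, "revisions": state["revision_count"]}

print(simulate("short draft"))
```

A short draft triggers one trip back to the writer; the `revision_count` cap guarantees the loop terminates even if the content never reaches the length threshold.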
Pattern 2: Parallel Agent Execution
Run multiple agents concurrently:

```python
from langgraph.graph import StateGraph

def parallel_research(state: BlogState) -> BlogState:
    """Coordinator that triggers parallel research"""
    return state

def technical_researcher(state: BlogState) -> BlogState:
    """Researches technical aspects"""
    # Research technical details (assumes a technical_research field on the state)
    state["technical_research"] = "Technical findings..."
    return state

def market_researcher(state: BlogState) -> BlogState:
    """Researches market trends"""
    # Research market trends (assumes a market_research field on the state)
    state["market_research"] = "Market findings..."
    return state

def synthesizer(state: BlogState) -> BlogState:
    """Combines research from multiple agents"""
    combined = f"{state['technical_research']}\n{state['market_research']}"
    state["research_notes"] = combined
    return state

# Build parallel workflow
workflow = StateGraph(BlogState)
workflow.add_node("coordinator", parallel_research)
workflow.add_node("tech_research", technical_researcher)
workflow.add_node("market_research", market_researcher)
workflow.add_node("synthesizer", synthesizer)

# Parallel execution: both research branches fan out from the coordinator
workflow.add_edge("coordinator", "tech_research")
workflow.add_edge("coordinator", "market_research")
workflow.add_edge("tech_research", "synthesizer")
workflow.add_edge("market_research", "synthesizer")
workflow.set_entry_point("coordinator")
```
Pattern 3: Hierarchical Multi-Agent System
Manager agent that delegates to worker agents:

```python
def manager_agent(state: BlogState) -> BlogState:
    """Manager that assigns tasks to workers"""
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a project manager. Analyze the task and decide which specialist to assign."),
        ("human", "Task: {task}\n\nAvailable specialists: research, writing, data_analysis\n\nWhich specialist should handle this?")
    ])
    chain = prompt | llm
    response = chain.invoke({"task": state["current_task"]})
    # Simplified: assumes the model answers with exactly one specialist name
    state["assigned_agent"] = response.content.strip().lower()
    return state

def route_to_specialist(state: BlogState) -> str:
    """Route based on the manager's decision"""
    return state["assigned_agent"]

# Add conditional routing based on the manager's decision
workflow.add_conditional_edges(
    "manager",
    route_to_specialist,
    {
        "research": "research_specialist",
        "writing": "writing_specialist",
        "data_analysis": "data_specialist"
    }
)
```
State Management in Multi-Agent Systems
State Persistence
Save and resume workflows:

```python
from langgraph.checkpoint.sqlite import SqliteSaver

# Create checkpointer
memory = SqliteSaver.from_conn_string(":memory:")

# Compile with checkpointing
app = workflow.compile(checkpointer=memory)

# Run with a thread ID for persistence
config = {"configurable": {"thread_id": "blog-post-123"}}
result = app.invoke(initial_state, config=config)

# Resume later with the same thread_id
continued_result = app.invoke(
    {"topic": "Continue working on this"},
    config=config
)
```
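Under the hood, a checkpointer is essentially a thread-keyed store of serialized state. As a rough mental model only (this is not LangGraph's actual implementation, and the `Checkpointer` class is hypothetical), a toy version with stdlib `sqlite3` might look like:

```python
import json
import sqlite3
from typing import Optional

class Checkpointer:
    """Toy thread-keyed state store, illustrating the persistence idea."""

    def __init__(self, conn_string: str = ":memory:"):
        self.db = sqlite3.connect(conn_string)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints (thread_id TEXT PRIMARY KEY, state TEXT)"
        )

    def save(self, thread_id: str, state: dict) -> None:
        # Last write per thread wins, mirroring resume-from-latest semantics
        self.db.execute(
            "INSERT OR REPLACE INTO checkpoints VALUES (?, ?)",
            (thread_id, json.dumps(state)),
        )
        self.db.commit()

    def load(self, thread_id: str) -> Optional[dict]:
        row = self.db.execute(
            "SELECT state FROM checkpoints WHERE thread_id = ?", (thread_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None

memory = Checkpointer()
memory.save("blog-post-123", {"topic": "AI agents", "revision_count": 2})
print(memory.load("blog-post-123"))
```

The real `SqliteSaver` additionally versions checkpoints per step, but the thread-id-as-key idea is the same.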
Streaming State Updates
Get real-time updates as agents work:
```python
# Stream the execution
for event in app.stream(initial_state):
    for node_name, node_state in event.items():
        print(f"\n--- {node_name} ---")
        print(f"Messages: {node_state.get('messages', [])[-1]}")
```
State Reducers
Combine updates from multiple agents:
```python
from operator import add
from typing import Annotated, List, TypedDict

class CollaborativeState(TypedDict):
    # Use operator.add to combine lists from multiple agents
    suggestions: Annotated[List[str], add]
    # Regular fields are replaced
    final_decision: str

# Multiple agents can add to suggestions simultaneously
def agent_1(state: CollaborativeState) -> CollaborativeState:
    return {"suggestions": ["Suggestion from Agent 1"]}

def agent_2(state: CollaborativeState) -> CollaborativeState:
    return {"suggestions": ["Suggestion from Agent 2"]}

# Both suggestions will be combined in the final state
```
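To build intuition for what a reducer does, here is a toy re-implementation of the merge rule in plain Python. This is not LangGraph's internal code, just the idea it expresses: reducer-annotated fields are combined, everything else is overwritten:

```python
from operator import add
from typing import Any, Callable, Dict

# Per-field reducers: 'suggestions' is merged with operator.add;
# fields without a reducer are simply overwritten (last write wins).
reducers: Dict[str, Callable[[Any, Any], Any]] = {"suggestions": add}

def apply_update(state: dict, update: dict) -> dict:
    merged = dict(state)
    for key, value in update.items():
        if key in reducers and key in merged:
            merged[key] = reducers[key](merged[key], value)
        else:
            merged[key] = value
    return merged

state = {"suggestions": [], "final_decision": ""}
state = apply_update(state, {"suggestions": ["Suggestion from Agent 1"]})
state = apply_update(state, {"suggestions": ["Suggestion from Agent 2"]})
state = apply_update(state, {"final_decision": "go"})
print(state)
# {'suggestions': ['Suggestion from Agent 1', 'Suggestion from Agent 2'], 'final_decision': 'go'}
```

Because each agent returns only its own partial update, two agents writing to `suggestions` in the same step never clobber each other.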
Real-World Use Case: Research Assistant Team
Let’s build a practical multi-agent system for comprehensive research:

```python
from typing import List, TypedDict
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

class ResearchState(TypedDict):
    query: str
    web_results: List[str]
    paper_summaries: List[str]
    code_examples: List[str]
    final_report: str

llm = ChatOpenAI(model="gpt-4")

def web_researcher(state: ResearchState) -> ResearchState:
    """Searches the web for relevant information"""
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a web research specialist. Find relevant online resources."),
        ("human", "Search for information about: {query}")
    ])
    # In production, integrate with actual search APIs
    chain = prompt | llm
    results = chain.invoke({"query": state["query"]})
    state["web_results"] = [results.content]
    return state

def academic_researcher(state: ResearchState) -> ResearchState:
    """Finds and summarizes academic papers"""
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are an academic researcher. Find and summarize relevant research papers."),
        ("human", "Find academic papers on: {query}")
    ])
    chain = prompt | llm
    results = chain.invoke({"query": state["query"]})
    state["paper_summaries"] = [results.content]
    return state

def code_researcher(state: ResearchState) -> ResearchState:
    """Finds code examples and implementations"""
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a code researcher. Find relevant code examples and implementations."),
        ("human", "Find code examples for: {query}")
    ])
    chain = prompt | llm
    results = chain.invoke({"query": state["query"]})
    state["code_examples"] = [results.content]
    return state

def report_synthesizer(state: ResearchState) -> ResearchState:
    """Combines all research into a comprehensive report"""
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a research synthesizer. Combine findings into a comprehensive report."),
        ("human", """Create a comprehensive research report.

Web Research:
{web_results}

Academic Papers:
{paper_summaries}

Code Examples:
{code_examples}
""")
    ])
    chain = prompt | llm
    report = chain.invoke({
        "web_results": "\n".join(state["web_results"]),
        "paper_summaries": "\n".join(state["paper_summaries"]),
        "code_examples": "\n".join(state["code_examples"])
    })
    state["final_report"] = report.content
    return state
```
```python
from langgraph.graph import START  # needed to fan out from the start

# Build the research workflow
research_workflow = StateGraph(ResearchState)

# Add all researcher nodes
research_workflow.add_node("web_researcher", web_researcher)
research_workflow.add_node("academic_researcher", academic_researcher)
research_workflow.add_node("code_researcher", code_researcher)
research_workflow.add_node("synthesizer", report_synthesizer)

# Fan out from START so all three researchers run in parallel
research_workflow.add_edge(START, "web_researcher")
research_workflow.add_edge(START, "academic_researcher")
research_workflow.add_edge(START, "code_researcher")

# All branches feed the synthesizer
research_workflow.add_edge("web_researcher", "synthesizer")
research_workflow.add_edge("academic_researcher", "synthesizer")
research_workflow.add_edge("code_researcher", "synthesizer")
research_workflow.add_edge("synthesizer", END)

# Compile
research_app = research_workflow.compile()

# Execute
result = research_app.invoke({
    "query": "LangGraph multi-agent patterns",
    "web_results": [],
    "paper_summaries": [],
    "code_examples": [],
    "final_report": ""
})

print(result["final_report"])
```

Note that a single entry point on `web_researcher` would leave the other two researchers unreachable; fanning out from `START` is what makes all three branches run.
Best Practices for Multi-Agent Systems
1. Design Clear Agent Responsibilities
Each agent should have a single, well-defined purpose:
```python
# Good: Clear, focused responsibility
def researcher_agent(state):
    """Only responsible for research"""
    pass

# Bad: Too many responsibilities
def do_everything_agent(state):
    """Research, write, edit, format, publish..."""
    pass
```
2. Implement Proper Error Handling
```python
def robust_agent(state: AgentState) -> AgentState:
    """Agent with proper error handling"""
    try:
        state["results"] = perform_task(state)
        state["status"] = "success"
        return state
    except Exception as e:
        state["status"] = "error"
        state["error_message"] = str(e)
        state["next_agent"] = "error_handler"
        return state
```
3. Use Type Hints and Validation
```python
from typing import List
from pydantic import BaseModel, Field

class ValidatedState(BaseModel):
    query: str = Field(min_length=1, max_length=500)
    results: List[str] = Field(default_factory=list)
    confidence: float = Field(ge=0.0, le=1.0)
```
4. Monitor and Log Agent Activities
```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def monitored_agent(state: AgentState) -> AgentState:
    logger.info(f"Agent started with state: {state}")
    # Perform work
    result = do_work(state)
    logger.info(f"Agent completed. Updated state: {result}")
    return result
```
5. Implement Human-in-the-Loop
```python
def human_review_needed(state: BlogState) -> str:
    """Check if human review is required"""
    if state["revision_count"] > 3:
        return "human_review"
    return "continue"

workflow.add_conditional_edges(
    "editor",
    human_review_needed,
    {
        "human_review": "await_human_input",
        "continue": END
    }
)
```
Common Pitfalls and Solutions
Pitfall 1: State Explosion
Problem: State grows too large with unnecessary data
Solution: Keep state minimal and focused
```python
# Bad: Storing everything
class BadState(TypedDict):
    all_intermediate_results: List[dict]
    every_api_call: List[dict]
    complete_conversation_history: List[str]

# Good: Only essential data
class GoodState(TypedDict):
    current_task: str
    final_result: str
    next_action: str
```
Pitfall 2: Circular Dependencies
Problem: Agents can get stuck in loops
Solution: Add loop counters and exit conditions
```python
def check_loop_limit(state: AgentState) -> str:
    if state.get("loop_count", 0) > 5:
        return END
    return "continue"
```
Pitfall 3: Poor Error Recovery
Problem: One agent failure breaks entire system
Solution: Implement fallback strategies
```python
def fallback_router(state: AgentState) -> str:
    if state["status"] == "error":
        if state["retry_count"] < 3:
            return "retry"
        return "fallback_agent"
    return "next_agent"
```
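Stripped of the graph machinery, the retry-then-fallback idea looks like this in plain Python. Both `flaky_task` and `run_with_recovery` are hypothetical helpers for illustration:

```python
def flaky_task(state: dict) -> dict:
    """Hypothetical task that fails on the first two attempts."""
    if state["retry_count"] < 2:
        raise RuntimeError("transient failure")
    return {**state, "status": "success"}

def run_with_recovery(task, state: dict, max_retries: int = 3) -> dict:
    """Retry the task up to max_retries times, then fall back
    instead of letting one failure break the whole system."""
    for attempt in range(max_retries):
        try:
            return task({**state, "retry_count": attempt})
        except RuntimeError:
            continue
    return {**state, "status": "fallback"}

print(run_with_recovery(flaky_task, {"status": "pending"}))
# {'status': 'success', 'retry_count': 2}
```

In a LangGraph workflow, the same decision lives in a conditional edge like `fallback_router` above, so retries and fallbacks become explicit paths in the graph rather than hidden try/except blocks.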
Performance Optimization
1. Parallel Execution Where Possible
```python
# Independent agents can run in parallel
workflow.add_edge("coordinator", "agent_1")
workflow.add_edge("coordinator", "agent_2")
workflow.add_edge("coordinator", "agent_3")
```
2. Use Smaller Models for Simple Tasks
```python
# Use GPT-3.5 for simple tasks
simple_llm = ChatOpenAI(model="gpt-3.5-turbo")

# Use GPT-4 only for complex reasoning
complex_llm = ChatOpenAI(model="gpt-4")
```
3. Implement Caching
```python
from langchain.cache import InMemoryCache
from langchain.globals import set_llm_cache

set_llm_cache(InMemoryCache())
```
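The same principle applies to any deterministic helper your agents call repeatedly. With the standard library alone you can memoize such calls via `functools.lru_cache` (illustrative, and separate from LangChain's LLM cache; `expensive_lookup` is a hypothetical stand-in):

```python
from functools import lru_cache

CALLS = {"count": 0}

@lru_cache(maxsize=128)
def expensive_lookup(query: str) -> str:
    """Stand-in for a slow, deterministic call an agent might repeat."""
    CALLS["count"] += 1
    return f"result for {query}"

expensive_lookup("langgraph")
expensive_lookup("langgraph")  # served from cache, no second call
print(CALLS["count"])  # 1
```

Caching only helps when inputs repeat exactly and the call is deterministic, so it suits lookups and tool calls better than high-temperature LLM generations.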
Conclusion
LangGraph transforms multi-agent system development by providing a robust, graph-based framework for orchestrating complex AI workflows. Its stateful architecture, conditional routing, and built-in persistence make it ideal for production applications.
Key takeaways:
- LangGraph uses graph structures to model agent interactions
- State management is central to coordinating multiple agents
- Conditional edges enable dynamic, intelligent routing
- Parallel execution improves performance
- Proper error handling and monitoring are essential
- Start simple and add complexity as needed
Next steps:
1. Install LangGraph and try the examples in this article
2. Build a simple two-agent system for your use case
3. Gradually add more agents and complexity
4. Implement monitoring and error handling
5. Deploy to production with persistence
Multi-agent systems with LangGraph open up possibilities that single agents cannot achieve. Whether you're building research assistants, content creation pipelines, or complex automation workflows, LangGraph provides the tools you need to succeed.
Ready to build your multi-agent system? Start with the examples in this guide and adapt them to your specific needs. The future of AI is collaborative, and LangGraph is your toolkit for making it happen.
---
Further Reading
- LangGraph Official Documentation
- LangChain Multi-Agent Tutorials
- AutoGen vs LangGraph: Framework Comparison (coming soon)
- Production-Ready Multi-Agent Systems: Best Practices (coming soon)
---
Have questions about building multi-agent systems? Drop a comment below, and let's discuss your use case!
Subscribe to our newsletter for more tutorials on AI agents, LangGraph, and cutting-edge AI development techniques.

