Navigating the AI Agent Framework Landscape: LangGraph, LlamaIndex Workflows, or Pure Code?
2025 is shaping up to be the year of AI Agents, moving beyond simple RAG and into autonomous AI systems. As developers, we’re faced with a crucial decision: which framework (if any) should we use to build these agents? The choices range from established options like LangGraph to newer entrants like LlamaIndex Workflows, multi-agent orchestration tools like CrewAI, or even a roll-your-own pure-code approach.
This article breaks down the strengths and weaknesses of each approach based on building the same agent across different frameworks.
The Framework Face-Off
1. Pure Code: The DIY Approach
Starting with a pure code approach offers the most control and understanding.
- Architecture: An OpenAI-powered router uses function calling to select the appropriate skill. A SkillMap organizes skills as individual classes, making it easy to add new skills without disrupting the router code (see the sketch after this list).
- Challenges: Crafting the router’s system prompt was difficult, as was managing the different output formats from each step.
- Verdict: A good baseline, offering simplicity but requiring careful management of complexity as the agent grows.
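To make the pure-code pattern concrete, here is a minimal, hedged sketch of what the router and SkillMap might look like. The LookupSalesData skill, the SkillMap methods, and the router prompt are illustrative assumptions, not the article’s actual implementation.

```python
import json
from openai import OpenAI

client = OpenAI()

# Hypothetical skill: each skill is a small class exposing a name,
# an OpenAI tool/function schema, and a run() method.
class LookupSalesData:
    name = "lookup_sales_data"
    description = "Look up sales data for a given product."
    parameters = {
        "type": "object",
        "properties": {"product": {"type": "string"}},
        "required": ["product"],
    }

    def run(self, product: str) -> str:
        return f"(stub) sales figures for {product}"

# The SkillMap registers skills by name so the router can dispatch
# to them without knowing their internals.
class SkillMap:
    def __init__(self, skills):
        self._skills = {s.name: s for s in skills}

    def tool_schemas(self):
        return [
            {
                "type": "function",
                "function": {
                    "name": s.name,
                    "description": s.description,
                    "parameters": s.parameters,
                },
            }
            for s in self._skills.values()
        ]

    def get(self, name):
        return self._skills[name]

skill_map = SkillMap([LookupSalesData()])

def route(user_message: str) -> str:
    """Ask the model which skill to call, then execute it."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Route the user's request to the right skill."},
            {"role": "user", "content": user_message},
        ],
        tools=skill_map.tool_schemas(),
    )
    message = response.choices[0].message
    if not message.tool_calls:
        return message.content  # the model answered directly
    call = message.tool_calls[0]
    skill = skill_map.get(call.function.name)
    return skill.run(**json.loads(call.function.arguments))
```

Adding a new skill means writing one more class and registering it in the SkillMap; the router code itself never changes.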
2. LangGraph: Structure and Abstraction
LangGraph, built on Langchain, uses a Pregel graph structure with nodes and edges to define agent loops.
- Architecture: A graph structure with nodes for OpenAI calls (“agent”) and tool handling (“tools”). LangGraph has a built-in object called ToolNode that takes a list of callable tools and triggers them based on a ChatMessage response, before returning to the “agent” node again (see the sketch after this list).
- Challenges: Tight integration with Langchain means you must work with Langchain objects, which can force refactoring of existing code. Debugging can be difficult due to confusing error messages and abstracted concepts.
- Benefits: The graph structure promotes clean and accessible code, especially for complex logic. Seamless integration for existing Langchain applications.
- Verdict: Great if you’re already in the Langchain ecosystem, but be prepared for debugging challenges if you deviate from the framework’s conventions.
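For readers who haven’t used LangGraph, a minimal sketch of the agent/tools loop described above might look roughly like this. The lookup_sales_data tool and the model choice are placeholder assumptions standing in for the agent’s real skills.

```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, MessagesState, START
from langgraph.prebuilt import ToolNode, tools_condition

# Hypothetical tool standing in for one of the agent's skills.
@tool
def lookup_sales_data(product: str) -> str:
    """Look up sales data for a given product."""
    return f"(stub) sales figures for {product}"

tools = [lookup_sales_data]
llm = ChatOpenAI(model="gpt-4o-mini").bind_tools(tools)

# The "agent" node calls the model; ToolNode runs any requested tools.
def agent(state: MessagesState):
    return {"messages": [llm.invoke(state["messages"])]}

graph = StateGraph(MessagesState)
graph.add_node("agent", agent)
graph.add_node("tools", ToolNode(tools))
graph.add_edge(START, "agent")
# Route to "tools" when the model asked for a tool call, otherwise end.
graph.add_conditional_edges("agent", tools_condition)
graph.add_edge("tools", "agent")
app = graph.compile()

result = app.invoke({"messages": [("user", "How did product X sell last month?")]})
```

Note that the state, the messages, and the tools all flow through Langchain objects, which is where the refactoring cost mentioned above comes from.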
3. LlamaIndex Workflows: Flexible and Event-Driven
Workflows aims to make agent construction easier, with a focus on asynchronous, event-driven execution.
- Architecture: Steps (similar to LangGraph nodes) contain logic, and events are emitted and received to move between steps (see the sketch after this list).
- Challenges: Designed around asynchronous execution, which can complicate debugging when running synchronously. As with LangGraph, you can run into Pydantic validation errors.
- Benefits: Lightweight and doesn’t impose excessive structure. The event-based architecture is beneficial for complex, asynchronous applications. Skills from the code-based agent can be used with Workflows without changes.
- Verdict: A good middle ground, offering flexibility and an event-driven approach without being overly prescriptive.
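A minimal, hedged sketch of the Workflows pattern: steps are async methods decorated with @step, and custom events carry data between them. The SkillEvent and the stubbed router/skill logic here are illustrative assumptions.

```python
import asyncio
from llama_index.core.workflow import (
    Event,
    StartEvent,
    StopEvent,
    Workflow,
    step,
)

# Hypothetical event carrying the router's decision to the next step.
class SkillEvent(Event):
    skill_name: str
    query: str

class AgentFlow(Workflow):
    @step
    async def route(self, ev: StartEvent) -> SkillEvent:
        # A real implementation would call the LLM router here.
        return SkillEvent(skill_name="lookup_sales_data", query=ev.query)

    @step
    async def run_skill(self, ev: SkillEvent) -> StopEvent:
        # A real implementation would dispatch to the matching skill class.
        result = f"(stub) ran {ev.skill_name} for: {ev.query}"
        return StopEvent(result=result)

async def main():
    flow = AgentFlow(timeout=60)
    result = await flow.run(query="How did product X sell last month?")
    print(result)

asyncio.run(main())
```

Because steps only communicate through events, the existing skill classes from the pure-code agent can be dropped into run_skill largely unchanged.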
Observability is Key
Regardless of the framework you choose, observability is crucial. Tools that let you inspect the full run of your agent and pinpoint where problems are arising can significantly ease the debugging process. Arize Phoenix can help there.
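As one hedged example, tracing an agent’s OpenAI calls with Phoenix might look roughly like this, assuming the arize-phoenix and openinference-instrumentation-openai packages (the exact setup varies by version):

```python
import phoenix as px
from phoenix.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor

# Launch the local Phoenix UI and point an OpenTelemetry tracer at it.
px.launch_app()
tracer_provider = register()

# Auto-instrument OpenAI calls so every LLM request the agent makes
# shows up as a trace in Phoenix, making it easier to see which step failed.
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)
```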
Which Framework Should You Choose?
Each approach offers distinct advantages. The pure code approach is simplest but may become messy as complexity increases. LangGraph provides structure but demands adherence to its conventions. Workflows balances flexibility with an event-driven architecture.
Ultimately, the choice depends on your existing stack. If you’re already using LlamaIndex or Langchain, the corresponding framework (Workflows or LangGraph) might be the most logical choice. The benefits of the agent-specific framework alone may not be enough to warrant switching.