Agent tracing and runtime observability with LangSmith
Instrument LangGraph agents with LangSmith to capture every LLM call, tool execution, and decision point — turning AI debugging from guesswork into data.
LangSmith acts as a flight recorder for your agents. Once you set LANGCHAIN_TRACING_V2=true, every LangGraph node execution, LLM call, and tool invocation is captured automatically — no manual instrumentation required. This page walks through environment setup, building a traceable agent, and reading the traces that LangSmith produces.
You need a free LangSmith account at smith.langchain.com and an OpenAI API key before starting.
Setting LANGCHAIN_TRACING_V2=true is the switch that turns tracing on. With a valid LANGCHAIN_API_KEY also set, all LangGraph operations are intercepted and logged to LangSmith from this point forward.
```python
import os

# Configure API keys - replace with your actual keys
os.environ['OPENAI_API_KEY'] = ''
os.environ['LANGCHAIN_API_KEY'] = ''
os.environ['LANGCHAIN_TRACING_V2'] = 'true'  # This triggers observability
os.environ['LANGCHAIN_PROJECT'] = 'langsmith-tutorial-demo'

# Verify configuration
required_vars = ['OPENAI_API_KEY', 'LANGCHAIN_API_KEY']
for var in required_vars:
    if not os.getenv(var) or 'your_' in os.getenv(var, ''):
        print(f"Warning: {var} needs your actual key")
    else:
        print(f"✓ {var} configured")

print(f"\nLangSmith Project: {os.getenv('LANGCHAIN_PROJECT')}")
print("\nTracing is now active - all AI operations will be logged for analysis")
```
The LANGCHAIN_PROJECT variable groups traces into a named project in your dashboard. Use a descriptive name per environment (e.g., production, staging, dev).
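One way to keep project names consistent across environments is to derive them from a single setting. The sketch below is illustrative only: the `configure_tracing` helper, the `APP_ENV` variable, and the `<app>-<environment>` naming scheme are assumptions made for this example, not LangSmith conventions.

```python
import os


def configure_tracing(environment: str, app_name: str = "langsmith-tutorial-demo") -> str:
    """Enable tracing and point it at a per-environment project.

    Hypothetical helper: the '<app>-<environment>' scheme is one possible
    naming choice, not something LangSmith requires.
    """
    project = f"{app_name}-{environment}"
    os.environ["LANGCHAIN_TRACING_V2"] = "true"
    os.environ["LANGCHAIN_PROJECT"] = project
    return project


# APP_ENV is an assumed variable name; default to "dev" when it is unset.
project = configure_tracing(os.getenv("APP_ENV", "dev"))
print(f"Traces will be grouped under: {project}")
```

Call a helper like this once at startup, before any graph is invoked, so that every trace from the process lands in the same project.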
Structured state gives LangSmith clear data to track as it flows through each node. Each field maps to a visible property in the trace viewer.
```python
from typing import TypedDict


class AgentState(TypedDict):
    """Simple state that flows through our agent workflow."""
    user_question: str   # The original question from the user
    needs_search: bool   # Whether we determined search is needed
    search_result: str   # Result from our search tool (if used)
    final_answer: str    # The response we'll give to the user
    reasoning: str       # Why we made our decisions (great for observability)
```
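Every run starts from an initial state in which only the question is known. As a minimal sketch, a small factory can fill the remaining fields with neutral defaults so that each one appears in the trace from the first node onward. The `make_initial_state` name and the chosen defaults are assumptions for illustration; `AgentState` is repeated here only to keep the example self-contained.

```python
from typing import TypedDict


class AgentState(TypedDict):
    user_question: str
    needs_search: bool
    search_result: str
    final_answer: str
    reasoning: str


def make_initial_state(question: str) -> AgentState:
    """Build a fully populated starting state (hypothetical helper)."""
    return AgentState(
        user_question=question,
        needs_search=False,   # overwritten by the decision node
        search_result="",     # overwritten by the search node
        final_answer="",      # overwritten by the response node
        reasoning="",         # overwritten by the decision node
    )


state = make_initial_state("What is LangGraph?")
print(sorted(state.keys()))
```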
Each function is a separate node. LangSmith shows which nodes ran and in what order for every invocation.
1. Decide whether to search
```python
from langchain_core.messages import SystemMessage
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)


def decide_search_need(state: AgentState) -> AgentState:
    """Analyze the question and decide if we need to search for current information."""
    decision_prompt = f"""
    Analyze this question and decide if it requires current/recent information:

    Question: "{state['user_question']}"

    Respond with exactly "SEARCH" or "DIRECT".
    Then on a new line, briefly explain your reasoning.
    """
    response = llm.invoke([SystemMessage(content=decision_prompt)])

    lines = response.content.strip().split('\n')
    decision = lines[0].strip()
    reasoning = lines[1] if len(lines) > 1 else "No reasoning provided"

    state["needs_search"] = decision == "SEARCH"
    state["reasoning"] = f"Decision: {decision}. Reasoning: {reasoning}"
    return state
```
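The node above takes the first line of the model's reply as the decision and the second line as the reasoning. Models do not always follow the format exactly: they may insert a blank line, change the case, or add punctuation. A more tolerant parser could look like the sketch below; `parse_decision` is a hypothetical helper, not part of the tutorial code.

```python
from typing import Tuple


def parse_decision(text: str) -> Tuple[bool, str]:
    """Split a model reply into (needs_search, reasoning).

    Blank lines are ignored, and the decision is matched case-insensitively
    after stripping surrounding quotes and periods.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return False, "No reasoning provided"
    decision = lines[0].strip("\"'.").upper()
    reasoning = " ".join(lines[1:]) or "No reasoning provided"
    return decision == "SEARCH", reasoning


needs_search, reasoning = parse_decision("SEARCH\n\nThe question asks about recent events.")
print(needs_search, "-", reasoning)
```

Defaulting to no search when the reply cannot be parsed is a design choice. Defaulting to search would be equally defensible if stale answers are the greater risk.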
2. Execute search when needed
```python
def execute_search(state: AgentState) -> AgentState:
    """Execute search if needed, otherwise skip this step."""
    if not state["needs_search"]:
        state["search_result"] = "No search performed"
        return state

    # wikipedia_search is the agent's search tool; its definition is not shown in this section.
    search_result = wikipedia_search.invoke({"query": state["user_question"]})
    state["search_result"] = search_result
    return state
```
3. Generate the final response
```python
def generate_response(state: AgentState) -> AgentState:
    """Generate the final response using all available information."""
    if state["needs_search"] and "Search error" not in state.get("search_result", ""):
        context = f"Question: {state['user_question']}\n\nSearch Results: {state['search_result']}"
        response_prompt = f"Answer using both your knowledge and the search results.\n\n{context}"
    else:
        response_prompt = f"Answer this question: {state['user_question']}"

    response = llm.invoke([SystemMessage(content=response_prompt)])
    state["final_answer"] = response.content
    return state
```
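The response node checks for the text "Search error" in the search result, which implies that the search tool reports failures as a string instead of raising an exception. The definition of `wikipedia_search` is not shown here, so the following is only a sketch of that convention. The `run_search` wrapper, its `fetch` callable, and the `max_chars` limit are all hypothetical.

```python
from typing import Callable


def run_search(query: str, fetch: Callable[[str], str], max_chars: int = 2000) -> str:
    """Call fetch(query) and always return a string.

    Failures come back as text starting with 'Search error', so a downstream
    node can test for that marker and fall back to answering without search
    results.
    """
    try:
        text = fetch(query)
    except Exception as exc:  # any backend failure becomes a readable message
        return f"Search error: {exc}"
    if not text:
        return "Search error: no results"
    return text[:max_chars]  # cap the size of what gets pasted into the prompt


# Stand-in backend for demonstration; a real one would query a search API.
print(run_search("LangGraph", lambda q: f"Summary for {q}"))
```

Returning the error as data keeps the graph running, and the failure message then shows up in the trace as the tool's output.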
After running the agent on a few test questions, open your LangSmith dashboard and select the langsmith-tutorial-demo project. You will see four views:
Trace list: A table of all executions with input, latency, cost, and success status. Sort by latency to identify slow queries or filter by tag to compare question types.
Trace detail: A graph view of the workflow showing which nodes executed and how data flowed between them. Click any node to inspect its inputs, outputs, and duration.
LLM calls: The exact prompt sent to the model and the raw response, including token counts and estimated cost per call.
Tool executions: Every wikipedia_search invocation with the query, returned text, and elapsed time.
LangSmith aggregates latency, cost, and error rates across all runs. Use these aggregates to set alert thresholds before deploying to production.
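To make the threshold idea concrete, the sketch below computes the same kinds of aggregates from a list of run records and reports which limits are exceeded. It is a stand-alone illustration: the record fields (`latency_s`, `cost_usd`, `error`), the sample values, and the threshold numbers are all made up for this example and do not reflect the LangSmith export format.

```python
import math
from typing import Dict, List


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list (pct between 0 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(runs: List[Dict]) -> Dict[str, float]:
    """Aggregate latency, error rate, and cost over a batch of runs."""
    return {
        "p95_latency_s": percentile([r["latency_s"] for r in runs], 95),
        "error_rate": sum(1 for r in runs if r["error"]) / len(runs),
        "total_cost_usd": sum(r["cost_usd"] for r in runs),
    }


def breached(summary: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Return the names of metrics whose value exceeds its threshold."""
    return [name for name, limit in thresholds.items() if summary[name] > limit]


# Illustrative records only; real values would come from your own runs.
sample_runs = [
    {"latency_s": 1.2, "cost_usd": 0.001, "error": False},
    {"latency_s": 0.8, "cost_usd": 0.001, "error": False},
    {"latency_s": 4.5, "cost_usd": 0.002, "error": True},
    {"latency_s": 1.0, "cost_usd": 0.001, "error": False},
]
summary = summarize(sample_runs)
print(breached(summary, {"p95_latency_s": 3.0, "error_rate": 0.5}))
```

A percentile is usually a better alerting target than the mean, because one slow run barely moves the average but is exactly what users notice.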