wrap_llm() wraps LLM SDK clients directly. Every LLM generation call and tool use block in the response is automatically traced — no callback wiring needed.
When to use this
| Scenario | Use |
|---|---|
| Agent built with LangChain/LangGraph | LangChain callback |
| Agent using raw OpenAI/Anthropic/Gemini SDK | wrap_llm() — see pages below |
| Agent using MCP sessions directly | wrap() |
Supported SDKs
Gemini SDK
google.genai.Client — new SDK with function calling
OpenAI SDK
openai.OpenAI / AsyncOpenAI — chat completions + tools
Anthropic SDK
anthropic.Anthropic / AsyncAnthropic — messages + tool_use
How it works
wrap_llm() auto-detects the SDK based on the client class:
| Class | Detected SDK |
|---|---|
| openai.OpenAI, openai.AsyncOpenAI | OpenAI |
| anthropic.Anthropic, anthropic.AsyncAnthropic | Anthropic |
| google.genai.Client | Gemini (new SDK) |
| google.generativeai.GenerativeModel | Gemini (legacy) |
| Anything else | Returned unchanged (fail-open) |
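The detection and fail-open behavior can be pictured as a check on the client's class and module names. This is an illustrative sketch, not LangSight's actual source: `detect_sdk` and `wrap_llm_sketch` are hypothetical names, and the real wrapper instruments the client rather than returning it untouched.

```python
# Illustrative sketch of class-based SDK detection with fail-open
# behavior. `detect_sdk` and its labels are hypothetical, not
# LangSight's real internals.

def detect_sdk(client):
    """Return an SDK label for a known client class, else None."""
    cls = type(client)
    qualname = f"{cls.__module__}.{cls.__name__}"
    if qualname in ("openai.OpenAI", "openai.AsyncOpenAI"):
        return "openai"
    if qualname in ("anthropic.Anthropic", "anthropic.AsyncAnthropic"):
        return "anthropic"
    if qualname == "google.genai.Client":
        return "gemini"
    if qualname == "google.generativeai.GenerativeModel":
        return "gemini-legacy"
    return None  # unknown client class

def wrap_llm_sketch(client):
    if detect_sdk(client) is None:
        return client  # fail-open: unknown clients pass through unchanged
    return client  # the real wrapper would instrument the client here
```

The fail-open path is the important design choice: passing a client `wrap_llm()` does not recognize never breaks the application, it just goes untraced.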
What gets traced
For each generation call, LangSight emits:
- LLM generation span — model, tokens, latency, span_type="agent"
- Tool call spans (one per tool use) — tool name, arguments, parent_span_id linking to the LLM span
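The parent/child linking can be sketched with plain dictionaries. The field names below follow the list above (span_type, parent_span_id); the exact span schema and the "tool" span type are assumptions, not LangSight's documented format.

```python
import uuid

def make_llm_span(model, tokens, latency_ms):
    """Hypothetical LLM generation span, using the fields listed above."""
    return {
        "span_id": str(uuid.uuid4()),
        "span_type": "agent",       # per the list above
        "model": model,
        "tokens": tokens,
        "latency_ms": latency_ms,
        "parent_span_id": None,     # top-level span
    }

def make_tool_span(parent, name, arguments):
    """Hypothetical tool-call span; parent_span_id links to the LLM span."""
    return {
        "span_id": str(uuid.uuid4()),
        "span_type": "tool",        # assumed label for tool-call spans
        "tool_name": name,
        "arguments": arguments,
        "parent_span_id": parent["span_id"],
    }

llm = make_llm_span("gpt-4o", tokens=532, latency_ms=840)
tool = make_tool_span(llm, "get_weather", {"city": "Paris"})
```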
Combining with MCP tracing
wrap_llm() and wrap() work independently. Use both when your agent calls LLM APIs directly and also uses MCP servers: spans from both wrappers carry the same session_id, so all spans appear in a single session trace.
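One way to picture the shared-session behavior: spans from both wrappers land in the same trace, keyed by session_id. This is an illustrative sketch with a hypothetical in-memory collector, not LangSight's API.

```python
from collections import defaultdict

# Hypothetical in-memory collector, keyed by session_id.
traces = defaultdict(list)

def emit(session_id, span):
    traces[session_id].append(span)

SESSION = "session-123"

# A span emitted by a wrap_llm()-instrumented LLM call ...
emit(SESSION, {"span_type": "agent", "model": "claude-sonnet"})
# ... and one from a wrap()-instrumented MCP tool call share the session.
emit(SESSION, {"span_type": "tool", "tool_name": "search_docs"})

assert len(traces[SESSION]) == 2  # both spans in one session trace
```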