braintrust-tracing


Braintrust tracing for Claude Code - hook architecture, sub-agent correlation, debugging

Install

mkdir -p .claude/skills/braintrust-tracing && curl -L -o skill.zip "https://mcp.directory/api/skills/download/3323" && unzip -o skill.zip -d .claude/skills/braintrust-tracing && rm skill.zip

Installs to .claude/skills/braintrust-tracing

About this skill

Braintrust Tracing for Claude Code

Comprehensive guide to tracing Claude Code sessions in Braintrust, including sub-agent correlation.

Architecture Overview

                         PARENT SESSION
                    +---------------------+
                    |  SessionStart       |
                    |  (creates root)     |
                    +----------+----------+
                               |
                    +----------v----------+
                    |  UserPromptSubmit   |
                    |  (creates Turn)     |
                    +----------+----------+
                               |
          +--------------------+--------------------+
          |                    |                    |
+---------v--------+  +--------v--------+  +--------v--------+
| PostToolUse      |  | PostToolUse     |  | PreToolUse      |
| (Read span)      |  | (Edit span)     |  | (Task - inject) |
+------------------+  +-----------------+  +--------+--------+
                                                    |
                                         +----------v----------+
                                         |   SUB-AGENT         |
                                         |   SessionStart      |
                                         |   (NEW root_span_id)|
                                         +----------+----------+
                                                    |
                                         +----------v----------+
                                         |   SubagentStop      |
                                         |   (has session_id)  |
                                         +---------------------+

Hook Event Flow

| Hook | Trigger | Creates | Key Fields |
|------|---------|---------|------------|
| SessionStart | Session begins | Root span | session_id, root_span_id |
| UserPromptSubmit | User sends prompt | Turn span | prompt, turn_number |
| PreToolUse | Before tool runs | (modifies Task prompts) | tool_input.prompt |
| PostToolUse | After tool runs | Tool span | tool_name, input, output |
| Stop | Turn completes | LLM spans | model, tokens, tool_calls |
| SubagentStop | Sub-agent finishes | (no span) | session_id of sub-agent |
| SessionEnd | Session ends | (finalizes root) | turn_count, tool_count |

Trace Hierarchy

Session (task span) - root_span_id = session_id
|
+-- Turn 1 (task span)
|   |
|   +-- claude-sonnet (llm span) - model call with tool_use
|   +-- Read (tool span)
|   +-- Edit (tool span)
|   +-- claude-sonnet (llm span) - response after tools
|
+-- Turn 2 (task span)
|   |
|   +-- claude-sonnet (llm span)
|   +-- Task (tool span) -----> [Sub-agent session - SEPARATE trace]
|   +-- claude-sonnet (llm span)
|
+-- Turn 3 ...
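
To make the hierarchy concrete, here is a minimal Python sketch that rebuilds the same kind of tree from a flat list of spans. The field names `span_id`, `parent_span_id`, `name`, and `type` are assumptions chosen for illustration; the shape of the records Braintrust actually returns may differ.

```python
from collections import defaultdict


def build_tree(spans):
    """Group spans by their parent id; root spans have parent_span_id None."""
    children = defaultdict(list)
    for span in spans:
        children[span.get("parent_span_id")].append(span)
    return children


def render(children, parent_id=None, depth=0):
    """Return indented lines, one per span, in depth-first order."""
    lines = []
    for span in children.get(parent_id, []):
        lines.append("  " * depth + f"{span['name']} ({span['type']})")
        lines.extend(render(children, span["span_id"], depth + 1))
    return lines


# Hypothetical spans mirroring Turn 1 in the diagram above.
spans = [
    {"span_id": "s1", "parent_span_id": None, "name": "Session", "type": "task"},
    {"span_id": "t1", "parent_span_id": "s1", "name": "Turn 1", "type": "task"},
    {"span_id": "l1", "parent_span_id": "t1", "name": "claude-sonnet", "type": "llm"},
    {"span_id": "r1", "parent_span_id": "t1", "name": "Read", "type": "tool"},
]
tree_lines = render(build_tree(spans))
print("\n".join(tree_lines))
```

A sub-agent's spans would not appear in this tree, because their root is a different session.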

Sub-Agent Tracing: What Works and What Doesn't

What Doesn't Work

SessionStart doesn't receive the Task prompt.

We tried injecting trace context into Task prompts via PreToolUse:

# PreToolUse hook injects:
[BRAINTRUST_TRACE_CONTEXT]
{"root_span_id": "abc", "parent_span_id": "xyz", "project_id": "123"}
[/BRAINTRUST_TRACE_CONTEXT]

But SessionStart only receives session metadata, not the modified prompt. The injected context is lost.
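
The failure can be illustrated with a short Python sketch. The real hooks are shell scripts; the function names here, and the choice to prepend the block to the prompt, are assumptions for illustration only. The marker strings and the SessionStart payload fields come from this guide.

```python
import json
import re

OPEN = "[BRAINTRUST_TRACE_CONTEXT]"
CLOSE = "[/BRAINTRUST_TRACE_CONTEXT]"


def inject_trace_context(prompt, context):
    """What PreToolUse attempts: wrap the context in markers and prepend it."""
    block = f"{OPEN}\n{json.dumps(context)}\n{CLOSE}"
    return f"{block}\n\n{prompt}"


def extract_trace_context(text):
    """Return the injected context as a dict, or None if no block is present."""
    pattern = re.escape(OPEN) + r"\s*(\{.*?\})\s*" + re.escape(CLOSE)
    match = re.search(pattern, text, re.DOTALL)
    return json.loads(match.group(1)) if match else None


context = {"root_span_id": "abc", "parent_span_id": "xyz", "project_id": "123"}
task_prompt = inject_trace_context("Summarize the repo", context)

# The modified Task prompt does carry the context...
found_in_prompt = extract_trace_context(task_prompt)

# ...but the sub-agent's SessionStart payload has no prompt field to search.
session_start_payload = {"session_id": "sub-1", "type": "start", "cwd": "/tmp"}
found_in_session_start = extract_trace_context(json.dumps(session_start_payload))

print(found_in_prompt, found_in_session_start)
```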

What DOES Work

Task spans in parent session contain everything:

  • agentId - identifier for the sub-agent run
  • totalTokens, totalToolUseCount - metrics
  • content - full agent response/summary
  • tool_input.prompt - original task prompt
  • tool_input.subagent_type - agent type (e.g., "oracle")

SubagentStop hook receives the sub-agent's session_id:

  • This equals the sub-agent's orphaned trace root_span_id
  • Allows correlation between parent Task span and child trace

The Correlation Pattern

Current state: Sub-agents create orphaned traces (new root_span_id).

Correlation method:

  1. Query parent session's Task spans for agent metadata
  2. Match agentId or timing with orphaned traces
  3. Sub-agent's session_id = its trace's root_span_id
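
The timing-based variant of these steps can be sketched in Python as follows. This is not what `braintrust_analyze.py` does; it is a hedged illustration. The `start` and `end` timestamp fields and the `max_gap_seconds` threshold are hypothetical, while `agentId` and `root_span_id` are the fields named above.

```python
def correlate(task_spans, orphan_traces, max_gap_seconds=5.0):
    """Pair each Task span with the first unclaimed orphaned trace that
    starts shortly after the Task span starts and before it ends."""
    links = {}
    unclaimed = sorted(orphan_traces, key=lambda t: t["start"])
    for task in sorted(task_spans, key=lambda s: s["start"]):
        for trace in unclaimed:
            gap = trace["start"] - task["start"]
            if 0 <= gap <= max_gap_seconds and trace["start"] <= task["end"]:
                # root_span_id of the orphaned trace equals the sub-agent session_id
                links[task["agentId"]] = trace["root_span_id"]
                unclaimed.remove(trace)
                break
    return links


# Hypothetical data: two Task spans and three orphaned traces.
task_spans = [
    {"agentId": "a1", "start": 100.0, "end": 160.0},
    {"agentId": "a2", "start": 200.0, "end": 230.0},
]
orphan_traces = [
    {"root_span_id": "sess-B", "start": 201.5},
    {"root_span_id": "sess-A", "start": 100.8},
    {"root_span_id": "sess-C", "start": 500.0},
]
links = correlate(task_spans, orphan_traces)
print(links)
```

Timing alone is ambiguous when several sub-agents start within the same window, so any match found this way should be treated as a candidate rather than a confirmed link.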

Future solution (not yet implemented):

SubagentStop fires -> writes session_id to temp file
PostToolUse (Task) -> reads temp file -> adds child_session_id to Task span metadata

This would link: Task.agentId + Task.child_session_id -> orphaned trace root_span_id

State Management

Per-Session State Files

~/.claude/state/braintrust_sessions/
  {session_id}.json       # Per-session state

Each session file contains:

{
  "root_span_id": "abc-123",
  "project_id": "proj-456",
  "turn_count": 5,
  "tool_count": 23,
  "current_turn_span_id": "turn-789",
  "current_turn_start": 1703456789,
  "started": "2025-12-24T10:00:00.000Z",
  "is_subagent": false
}
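
When inspecting these files by hand becomes tedious, a small Python helper along the following lines can load and sanity-check them. Treating the four listed keys as required is an assumption based on the example above, not a documented schema, and the demo runs against a temporary directory rather than the real state directory.

```python
import json
import tempfile
from pathlib import Path

# Assumption: these keys, taken from the example state file, are always present.
REQUIRED_KEYS = {"root_span_id", "project_id", "turn_count", "tool_count"}


def load_session_states(state_dir):
    """Return (valid, problems), each keyed by session_id (the file stem)."""
    valid, problems = {}, {}
    for path in sorted(Path(state_dir).glob("*.json")):
        try:
            state = json.loads(path.read_text())
        except (OSError, ValueError) as exc:  # JSONDecodeError is a ValueError
            problems[path.stem] = f"unreadable: {exc}"
            continue
        if not isinstance(state, dict):
            problems[path.stem] = "not a JSON object"
            continue
        missing = REQUIRED_KEYS - state.keys()
        if missing:
            problems[path.stem] = "missing keys: " + ", ".join(sorted(missing))
        else:
            valid[path.stem] = state
    return valid, problems


# Demo: one well-formed state file and one corrupted file.
with tempfile.TemporaryDirectory() as tmp:
    good = {"root_span_id": "abc-123", "project_id": "proj-456",
            "turn_count": 5, "tool_count": 23}
    (Path(tmp) / "sess-ok.json").write_text(json.dumps(good))
    (Path(tmp) / "sess-bad.json").write_text("{not json")
    valid, problems = load_session_states(tmp)

print(sorted(valid), sorted(problems))
```

To check real sessions, pass `Path.home() / ".claude" / "state" / "braintrust_sessions"` as `state_dir`.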

Global State

~/.claude/state/braintrust_global.json   # Cached project_id
~/.claude/state/braintrust_hook.log      # Debug log

Debugging Commands

Check if Tracing is Active

# View hook logs in real-time
tail -f ~/.claude/state/braintrust_hook.log

# Check if session has state
cat ~/.claude/state/braintrust_sessions/*.json | jq -s '.'

# Verify environment
echo "TRACE_TO_BRAINTRUST=$TRACE_TO_BRAINTRUST"
echo "BRAINTRUST_API_KEY=${BRAINTRUST_API_KEY:+set}"

Query Braintrust Directly

# List recent sessions
uv run python -m runtime.harness scripts/braintrust_analyze.py --sessions 5

# Analyze last session
uv run python -m runtime.harness scripts/braintrust_analyze.py --last-session

# Replay specific session
uv run python -m runtime.harness scripts/braintrust_analyze.py --replay <session-id>

# Find sub-agent traces (orphaned roots)
uv run python -m runtime.harness scripts/braintrust_analyze.py --agent-stats

Debug Hook Execution

# Enable verbose logging
export BRAINTRUST_CC_DEBUG=true

# Test hooks manually
echo '{"session_id":"test-123","type":"resume"}' | \
  bash "$CLAUDE_PROJECT_DIR/.claude/plugins/braintrust-tracing/hooks/session_start.sh"

# Test PreToolUse (Task injection)
echo '{"session_id":"test-123","tool_name":"Task","tool_input":{"prompt":"test"}}' | \
  bash "$CLAUDE_PROJECT_DIR/.claude/plugins/braintrust-tracing/hooks/pre_tool_use.sh"

Troubleshooting Checklist

  1. No traces appearing:

    • Check TRACE_TO_BRAINTRUST=true in .claude/settings.local.json
    • Verify the API key is set without printing it: echo "${BRAINTRUST_API_KEY:+set}"
    • Check logs: tail -20 ~/.claude/state/braintrust_hook.log
  2. Sub-agents not linking:

    • This is expected - sub-agents create orphaned traces
    • Use --agent-stats to find agent activity
    • Correlate via timing or agentId in parent Task span
  3. Missing spans:

    • Check current_turn_span_id in session state
    • Ensure Stop hook runs (turn finalization)
    • Look for "Failed to create" errors in log
  4. State corruption:

    • Remove session state: rm ~/.claude/state/braintrust_sessions/*.json
    • Clear global cache: rm ~/.claude/state/braintrust_global.json

Key Files

| File | Purpose |
|------|---------|
| .claude/plugins/braintrust-tracing/hooks/common.sh | Shared utilities, API, state management |
| .claude/plugins/braintrust-tracing/hooks/session_start.sh | Creates root span, handles sub-agent context |
| .claude/plugins/braintrust-tracing/hooks/user_prompt_submit.sh | Creates Turn spans per user message |
| .claude/plugins/braintrust-tracing/hooks/pre_tool_use.sh | Injects trace context into Task prompts |
| .claude/plugins/braintrust-tracing/hooks/post_tool_use.sh | Creates tool spans, captures agent/skill metadata |
| .claude/plugins/braintrust-tracing/hooks/stop_hook.sh | Creates LLM spans, finalizes Turns |
| .claude/plugins/braintrust-tracing/hooks/session_end.sh | Finalizes session, triggers learning extraction |
| scripts/braintrust_analyze.py | Query and analyze traced sessions |
| ~/.claude/state/braintrust_sessions/ | Per-session state files |
| ~/.claude/state/braintrust_hook.log | Debug log |

Environment Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| TRACE_TO_BRAINTRUST | Yes | - | Set to "true" to enable |
| BRAINTRUST_API_KEY | Yes | - | API key for Braintrust |
| BRAINTRUST_CC_PROJECT | No | claude-code | Project name |
| BRAINTRUST_CC_DEBUG | No | false | Verbose logging |
| BRAINTRUST_API_URL | No | https://api.braintrust.dev | API endpoint |
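
As a rough illustration of how these variables combine, the Python sketch below resolves the optional values to their defaults and reports whether tracing would be active. The function name `resolve_config` and the `tracing_active` key are hypothetical, and the exact-match comparison against "true" is an assumption; the shell hooks may parse the flag differently.

```python
import os

# Defaults for the optional variables, as listed in the table above.
DEFAULTS = {
    "BRAINTRUST_CC_PROJECT": "claude-code",
    "BRAINTRUST_CC_DEBUG": "false",
    "BRAINTRUST_API_URL": "https://api.braintrust.dev",
}


def resolve_config(env=None):
    """Return optional settings with defaults applied, plus a tracing_active flag."""
    env = os.environ if env is None else env
    config = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    enabled = env.get("TRACE_TO_BRAINTRUST", "") == "true"  # assumed exact match
    has_key = bool(env.get("BRAINTRUST_API_KEY"))
    # Both required variables must be present; the key itself is never returned.
    config["tracing_active"] = enabled and has_key
    return config


print(resolve_config())
```

Passing a plain dictionary instead of relying on `os.environ` makes it easy to check a settings file's values without exporting them into the shell first.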

Session Learnings

What We Learned About Sub-Agent Tracing (Dec 2025)

Attempted: Inject trace context via PreToolUse into Task prompts.

Result: Failed - SessionStart only receives session metadata, not the prompt.

Discovery: Task spans already contain rich sub-agent data:

  • metadata.agent_type - agent type from subagent_type
  • metadata.skill_name - skill from Skill tool
  • tool_input - full prompt sent to agent
  • tool_output - agent response

Current correlation path:

  1. Parent session Task span has agentId and timing
  2. Sub-agent creates orphaned trace with root_span_id = session_id
  3. SubagentStop provides the sub-agent's session_id
  4. Manual correlation: match timing or use session_id link

Future work: Write child_session_id to Task span metadata from PostToolUse after SubagentStop.

What We Learned About Sub-Agent Correlation

The Problem

  • Sub-agents spawned via Task tool create orphaned Braintrust traces
  • Parent session has Task spans with agentId, sub-agent has separate session_id
  • No built-in link between them

What DOESN'T Work

1. Prompt injection via PreToolUse

SessionStart hook only receives session metadata (session_id, type, cwd), NOT the prompt. Injected trace context is never seen.

The hook receives:

{
  "session_id": "...",
  "type": "start|resume|compact|clear",
  "cwd": "...",
  "env": {...}
}

No prompt field exists - context injection is impossible at SessionStart.

2. SubagentStop → PostToolUse file handoff

Race condition. The


Content truncated.
