massgen-log-analyzer


Run MassGen experiments and analyze logs using automation mode, logfire tracing, and SQL queries. Use this skill for performance analysis, debugging agent behavior, evaluating coordination patterns, and improving the logging structure, or whenever an ANALYSIS_REPORT.md is needed in a log directory.

Install

mkdir -p .claude/skills/massgen-log-analyzer && curl -L -o skill.zip "https://mcp.directory/api/skills/download/5606" && unzip -o skill.zip -d .claude/skills/massgen-log-analyzer && rm skill.zip

Installs to .claude/skills/massgen-log-analyzer

About this skill

MassGen Log Analyzer

This skill provides a structured workflow for running MassGen experiments and analyzing the resulting traces and logs using Logfire.

Purpose

The log-analyzer skill helps you:

  • Run MassGen experiments with proper instrumentation
  • Query and analyze traces hierarchically
  • Debug agent behavior and coordination patterns
  • Measure performance and identify bottlenecks
  • Improve the logging structure itself
  • Generate markdown analysis reports saved to the log directory

CLI Quick Reference

The massgen logs CLI provides quick access to log analysis:

List Logs with Analysis Status

uv run massgen logs list                    # Show all recent logs with analysis status
uv run massgen logs list --analyzed         # Only logs with ANALYSIS_REPORT.md
uv run massgen logs list --unanalyzed       # Only logs needing analysis
uv run massgen logs list --limit 20         # Show more logs

Generate Analysis Prompt

# Run from within your coding CLI (e.g., Claude Code) so it sees output
uv run massgen logs analyze                 # Analyze latest turn of latest log
uv run massgen logs analyze --log-dir PATH  # Analyze specific log
uv run massgen logs analyze --turn 1        # Analyze specific turn

The prompt output tells your coding CLI to use this skill on the specified log directory.

Multi-Agent Self-Analysis

uv run massgen logs analyze --mode self                 # Run 3-agent analysis team (prompts if report exists)
uv run massgen logs analyze --mode self --force         # Overwrite existing report without prompting
uv run massgen logs analyze --mode self --turn 2        # Analyze specific turn
uv run massgen logs analyze --mode self --config PATH   # Use custom config

Self-analysis mode runs MassGen with multiple agents to analyze logs from different perspectives (correctness, efficiency, behavior) and produces a combined ANALYSIS_REPORT.md.

Multi-Turn Sessions

MassGen log directories support multiple turns (coordination sessions). Each turn has its own turn_N/ directory with attempts inside:

log_YYYYMMDD_HHMMSS/
├── turn_1/                    # First coordination session
│   ├── ANALYSIS_REPORT.md     # Report for turn 1
│   ├── attempt_1/             # First attempt
│   └── attempt_2/             # Retry if orchestration restarted
├── turn_2/                    # Second coordination session (if multi-turn)
│   ├── ANALYSIS_REPORT.md     # Report for turn 2
│   └── attempt_1/

When analyzing, the --turn flag specifies which turn to analyze. Without it, the latest turn is analyzed.
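The "latest turn" rule can be sketched in Python when you need to locate the same directory from a script. This is a minimal sketch based only on the turn_N/ layout shown above, not MassGen's own implementation; latest_turn_dir is a hypothetical helper name.

```python
import re
from pathlib import Path
from typing import Optional

# Matches directory names such as turn_1, turn_2, turn_10.
TURN_DIR = re.compile(r"^turn_(\d+)$")


def latest_turn_dir(log_dir: Path) -> Optional[Path]:
    """Return the turn_N directory with the highest N, or None if there is none."""
    turns = []
    for child in log_dir.iterdir():
        match = TURN_DIR.match(child.name)
        if child.is_dir() and match:
            turns.append((int(match.group(1)), child))
    if not turns:
        return None
    # Compare numerically so that turn_10 sorts after turn_2.
    return max(turns, key=lambda pair: pair[0])[1]
```

The numeric comparison matters once a session reaches ten or more turns, because a plain string sort would place turn_10 before turn_2.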

When to Use Logfire vs Local Logs

Use Local Log Files When:

  • Analyzing command patterns and repetition (commands are in streaming_debug.log)
  • Checking detailed tool arguments and outputs (in coordination_events.json)
  • Reading vote reasoning and agent decisions (in agent_*/*/vote.json)
  • Viewing the coordination flow table (in coordination_table.txt)
  • Getting cost/token summaries (in metrics_summary.json)

Use Logfire When:

  • You need precise timing data with millisecond accuracy
  • Analyzing span hierarchy and parent-child relationships
  • Finding exceptions and error stack traces
  • Creating shareable trace links for collaboration
  • Querying across multiple sessions (e.g., "find all sessions with errors")
  • Real-time monitoring of running experiments

Rate Limiting: If Logfire returns a rate limit error, wait up to 60 seconds and retry rather than falling back to local logs. The rate limit resets quickly and Logfire data is worth waiting for when timing/hierarchy analysis is needed.
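If you query Logfire from a script rather than through the MCP tools, the wait-and-retry advice might look like the sketch below. RateLimited is a hypothetical exception used only for illustration; how a rate limit actually surfaces depends on the client you use, so adapt the except clause accordingly.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical marker for a Logfire rate-limit response."""


def retry_on_rate_limit(
    query: Callable[[], T],
    max_wait_s: float = 60.0,
    step_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call query(), pausing and retrying on RateLimited for up to max_wait_s seconds."""
    waited = 0.0
    while True:
        try:
            return query()
        except RateLimited:
            if waited >= max_wait_s:
                raise  # Waiting budget exhausted: let the caller decide what to do.
            pause = min(step_s, max_wait_s - waited)
            sleep(pause)
            waited += pause
```

The sleep parameter is injectable so the helper can be exercised without real delays.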

Key Local Log Files:

| File | Contains |
| --- | --- |
| status.json | Real-time status with agent reliability metrics (enforcement events, buffer loss) |
| metrics_summary.json | Cost, tokens, tool stats, round history |
| coordination_events.json | Full event timeline with tool calls |
| coordination_table.txt | Human-readable coordination flow |
| streaming_debug.log | Raw streaming data including command strings |
| agent_*/*/vote.json | Vote reasoning and context |
| agent_*/*/execution_trace.md | Full tool calls, arguments, results, and reasoning - invaluable for debugging |
| execution_metadata.yaml | Config and session metadata |

Execution Traces (execution_trace.md): These are the most detailed debug artifacts. Each agent snapshot includes an execution trace with:

  • Complete tool calls with full arguments (not truncated)
  • Full tool results (not truncated)
  • Reasoning/thinking blocks from the model
  • Timestamps and round markers

Use execution traces when you need to understand exactly what an agent did and why - they capture everything the agent saw and produced during that answer/vote iteration.

Enforcement Reliability (status.json): The status.json file includes per-agent reliability metrics that track workflow enforcement events:

{
  "agents": {
    "agent_a": {
      "reliability": {
        "enforcement_attempts": [
          {
            "round": 0,
            "attempt": 1,
            "max_attempts": 3,
            "reason": "no_workflow_tool",
            "tool_calls": ["search", "read_file"],
            "error_message": "Must use workflow tools",
            "buffer_preview": "First 500 chars of lost content...",
            "buffer_chars": 1500,
            "timestamp": 1736683468.123
          }
        ],
        "by_round": {"0": {"count": 2, "reasons": ["no_workflow_tool", "invalid_vote_id"]}},
        "unknown_tools": ["execute_command"],
        "workflow_errors": ["invalid_vote_id"],
        "total_enforcement_retries": 2,
        "total_buffer_chars_lost": 3000,
        "outcome": "ok"
      }
    }
  }
}

Enforcement Reason Codes:

| Reason | Description |
| --- | --- |
| no_workflow_tool | Agent called tools but none were vote or new_answer |
| no_tool_calls | Agent provided text-only response, no tools called |
| invalid_vote_id | Agent voted for non-existent agent ID |
| vote_no_answers | Agent tried to vote when no answers exist |
| vote_and_answer | Agent used both vote and new_answer in same response |
| answer_limit | Agent hit max answer count limit |
| answer_novelty | Answer too similar to existing answers |
| answer_duplicate | Exact duplicate of existing answer |
| api_error | API/streaming error (e.g., "peer closed connection") |
| connection_recovery | API stream ended early, recovered with preserved context |
| mcp_disconnected | MCP server disconnected mid-session (e.g., "Server 'X' not connected") |

This data is invaluable for understanding why agents needed retries and how much content was lost due to enforcement restarts.
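To get a per-agent overview of this data, a short script can aggregate the reliability block. The sketch below assumes the status.json shape shown in the example above (an "agents" mapping with a "reliability" object per agent); summarize_reliability and load_status are hypothetical helper names, and missing fields fall back to defaults rather than raising.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_reliability(status: dict) -> dict:
    """Collapse each agent's reliability block into retries, lost chars, and reason counts."""
    summary = {}
    for agent_id, agent in status.get("agents", {}).items():
        reliability = agent.get("reliability") or {}
        attempts = reliability.get("enforcement_attempts", [])
        summary[agent_id] = {
            "retries": reliability.get("total_enforcement_retries", 0),
            "chars_lost": reliability.get("total_buffer_chars_lost", 0),
            # Count how often each enforcement reason code appears.
            "reasons": dict(Counter(a.get("reason", "unknown") for a in attempts)),
            "outcome": reliability.get("outcome"),
        }
    return summary


def load_status(path: Path) -> dict:
    """Read a status.json file from disk."""
    return json.loads(path.read_text(encoding="utf-8"))
```

For example, summarize_reliability(load_status(Path("[log_dir]/turn_1/attempt_1/status.json"))) returns one entry per agent, which makes it easy to spot agents with many retries or a large amount of lost buffer content.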

Logfire Setup

Before using the Logfire-backed parts of this skill, set up Logfire for observability. Analysis of local log files works without it.

Step 1: Install MassGen with Observability Support

pip install "massgen[observability]"

# Or with uv
uv pip install "massgen[observability]"

Step 2: Create a Logfire Account

Go to https://logfire.pydantic.dev/ and create a free account.

Step 3: Authenticate with Logfire

# This creates ~/.logfire/credentials.json
uv run logfire auth

# Or set the token directly as an environment variable
export LOGFIRE_TOKEN=your_token_here

Step 4: Get Your Read Token for the MCP Server

  1. Go to https://logfire.pydantic.dev/ and log in
  2. Navigate to your project settings
  3. Create a Read Token (this is different from the write token used for authentication)
  4. Copy the token for use in Step 5

Step 5: Add the Logfire MCP Server

claude mcp add logfire -e LOGFIRE_READ_TOKEN="your-read-token-here" -- uvx logfire-mcp@latest

Then restart Claude Code and re-invoke this skill.

Prerequisites

Logfire MCP Server (Optional but Recommended): The Logfire MCP server provides enhanced analysis with precise timing data and cross-session queries. If LOGFIRE_READ_TOKEN is not set, self-analysis mode will automatically disable the Logfire MCP and fall back to local log files only.

When configured, the MCP server provides these tools:

  • mcp__logfire__arbitrary_query - Run SQL queries against logfire data
  • mcp__logfire__schema_reference - Get the database schema
  • mcp__logfire__find_exceptions_in_file - Find exceptions in a file
  • mcp__logfire__logfire_link - Create links to traces in the UI

Flags:

  • --automation - Required. Produces clean output for programmatic parsing (see the massgen-develops-massgen skill for more on this flag)
  • --logfire - Enables Logfire tracing. Optional for a run in general, but required if you want the run's data to appear in Logfire

Part 1: Running MassGen Experiments

Basic Command Format

uv run massgen --automation --logfire --config [config_file] "[question]"

Running in Background (Recommended)

Use run_in_background: true (or your tool's equivalent for background tasks) to run experiments asynchronously, so you can monitor progress and stop a run early if needed.

Expected Output (first lines):

LOG_DIR: .massgen/massgen_logs/log_YYYYMMDD_HHMMSS_ffffff
STATUS: .massgen/massgen_logs/log_YYYYMMDD_HHMMSS_ffffff/turn_1/attempt_1/status.json
QUESTION: Your task here
[Coordination in progress - monitor status.json for real-time updates]

Parse the LOG_DIR - you'll need this for file-based analysis!
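Extracting these values from captured output can be sketched as follows. The sketch assumes only the "KEY: value" header lines shown above; parse_run_header is a hypothetical helper name, and any other output lines are ignored.

```python
def parse_run_header(output: str) -> dict:
    """Extract LOG_DIR, STATUS, and QUESTION from the first lines of automation output."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(": ")
        if sep and key in ("LOG_DIR", "STATUS", "QUESTION"):
            # Keep the first occurrence of each key and ignore later repeats.
            fields.setdefault(key.lower(), value.strip())
    return fields
```

Calling parse_run_header(captured_output)["log_dir"] then gives the directory to pass to the file-based analysis steps.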

Monitoring Progress

status.json updates every 2 seconds; use that to track progress.

cat [log_dir]/turn_1/attempt_1/status.json

Key fields to monitor:

  • coordination.completion_percentage (0-100)
  • coordination.phase - "initial_answer", "enforcement", "presentation"
  • results.winner - null while running, agent_id when complete
  • agents[].status - "waiting", "streaming", "answered", "voted", "error"
  • agents[].error - null if ok, error details if failed
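Polling these fields from a script might look like the sketch below. It assumes that "agents" is a mapping keyed by agent ID, as in the reliability example earlier, and that the field names match the list above. progress_line and wait_for_winner are hypothetical helper names, and the 600-second timeout is an arbitrary illustrative default.

```python
import json
import time
from pathlib import Path
from typing import Optional


def progress_line(status: dict) -> str:
    """Format phase, completion, winner, and per-agent status as one line."""
    coordination = status.get("coordination") or {}
    results = status.get("results") or {}
    agents = status.get("agents") or {}  # Assumed: mapping of agent ID to agent state.
    states = ", ".join(f"{agent_id}={agent.get('status')}" for agent_id, agent in agents.items())
    return (
        f"{coordination.get('phase')} {coordination.get('completion_percentage')}% "
        f"winner={results.get('winner')} [{states}]"
    )


def wait_for_winner(status_path: Path, interval_s: float = 2.0, timeout_s: float = 600.0) -> Optional[str]:
    """Poll status.json until results.winner is set; return it, or None on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            status = json.loads(status_path.read_text(encoding="utf-8"))
        except (FileNotFoundError, json.JSONDecodeError):
            status = {}  # File not written yet, or caught in the middle of an update.
        winner = (status.get("results") or {}).get("winner")
        if winner is not None:
            return winner
        time.sleep(interval_s)
    return None
```

The 2-second default interval matches the update cadence of status.json, so polling faster would not yield fresher data.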

Reading Final Results

After completion (exit code 0):

# Read the final answer
cat [

---

*Content truncated.*
