massgen-log-analyzer
Run MassGen experiments and analyze logs using automation mode, Logfire tracing, and SQL queries. Use this skill for performance analysis, debugging agent behavior, evaluating coordination patterns, improving the logging structure, or whenever an ANALYSIS_REPORT.md is needed in a log directory.
Install
mkdir -p .claude/skills/massgen-log-analyzer && curl -L -o skill.zip "https://mcp.directory/api/skills/download/5606" && unzip -o skill.zip -d .claude/skills/massgen-log-analyzer && rm skill.zip
Installs to .claude/skills/massgen-log-analyzer
About this skill
MassGen Log Analyzer
This skill provides a structured workflow for running MassGen experiments and analyzing the resulting traces and logs using Logfire.
Purpose
The log-analyzer skill helps you:
- Run MassGen experiments with proper instrumentation
- Query and analyze traces hierarchically
- Debug agent behavior and coordination patterns
- Measure performance and identify bottlenecks
- Improve the logging structure itself
- Generate markdown analysis reports saved to the log directory
CLI Quick Reference
The massgen logs CLI provides quick access to log analysis:
List Logs with Analysis Status
uv run massgen logs list # Show all recent logs with analysis status
uv run massgen logs list --analyzed # Only logs with ANALYSIS_REPORT.md
uv run massgen logs list --unanalyzed # Only logs needing analysis
uv run massgen logs list --limit 20 # Show more logs
Generate Analysis Prompt
# Run from within your coding CLI (e.g., Claude Code) so it sees output
uv run massgen logs analyze # Analyze latest turn of latest log
uv run massgen logs analyze --log-dir PATH # Analyze specific log
uv run massgen logs analyze --turn 1 # Analyze specific turn
The prompt output tells your coding CLI to use this skill on the specified log directory.
Multi-Agent Self-Analysis
uv run massgen logs analyze --mode self # Run 3-agent analysis team (prompts if report exists)
uv run massgen logs analyze --mode self --force # Overwrite existing report without prompting
uv run massgen logs analyze --mode self --turn 2 # Analyze specific turn
uv run massgen logs analyze --mode self --config PATH # Use custom config
Self-analysis mode runs MassGen with multiple agents to analyze logs from different perspectives (correctness, efficiency, behavior) and produces a combined ANALYSIS_REPORT.md.
Multi-Turn Sessions
MassGen log directories support multiple turns (coordination sessions). Each turn has its own turn_N/ directory with attempts inside:
log_YYYYMMDD_HHMMSS/
├── turn_1/ # First coordination session
│ ├── ANALYSIS_REPORT.md # Report for turn 1
│ ├── attempt_1/ # First attempt
│ └── attempt_2/ # Retry if orchestration restarted
├── turn_2/ # Second coordination session (if multi-turn)
│ ├── ANALYSIS_REPORT.md # Report for turn 2
│ └── attempt_1/
When analyzing, the --turn flag specifies which turn to analyze. Without it, the latest turn is analyzed.
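The default turn-selection behavior can be sketched in a few lines, assuming the directory layout above (this is an illustration, not the CLI's actual implementation):

```python
from pathlib import Path

def latest_turn(log_dir):
    """Return the highest-numbered turn_N/ directory in a MassGen log
    dir, mirroring the default behavior when --turn is omitted."""
    turns = [
        p for p in Path(log_dir).glob("turn_*")
        if p.is_dir() and p.name.split("_", 1)[1].isdigit()
    ]
    if not turns:
        return None
    # Compare numerically so turn_10 ranks above turn_2.
    return max(turns, key=lambda p: int(p.name.split("_", 1)[1]))
```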
When to Use Logfire vs Local Logs
Use Local Log Files When:
- Analyzing command patterns and repetition (commands are in streaming_debug.log)
- Checking detailed tool arguments and outputs (in coordination_events.json)
- Reading vote reasoning and agent decisions (in agent_*/*/vote.json)
- Viewing the coordination flow table (in coordination_table.txt)
- Getting cost/token summaries (in metrics_summary.json)
Use Logfire When:
- You need precise timing data with millisecond accuracy
- Analyzing span hierarchy and parent-child relationships
- Finding exceptions and error stack traces
- Creating shareable trace links for collaboration
- Querying across multiple sessions (e.g., "find all sessions with errors")
- Real-time monitoring of running experiments
Rate Limiting: If Logfire returns a rate limit error, wait up to 60 seconds and retry rather than falling back to local logs. The rate limit resets quickly and Logfire data is worth waiting for when timing/hierarchy analysis is needed.
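That retry policy might look like the following sketch, where `run_query` stands in for whatever function issues the Logfire query (the name is hypothetical):

```python
import time

def query_with_retry(run_query, sql, max_wait=60, interval=5):
    """Retry run_query(sql) on rate-limit errors for up to max_wait
    seconds instead of falling back to local logs immediately."""
    deadline = time.monotonic() + max_wait
    while True:
        try:
            return run_query(sql)
        except Exception as exc:
            # Only retry rate-limit errors, and only until the deadline.
            if "rate limit" not in str(exc).lower() or time.monotonic() >= deadline:
                raise
            time.sleep(interval)
```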
Key Local Log Files:
| File | Contains |
|---|---|
| status.json | Real-time status with agent reliability metrics (enforcement events, buffer loss) |
| metrics_summary.json | Cost, tokens, tool stats, round history |
| coordination_events.json | Full event timeline with tool calls |
| coordination_table.txt | Human-readable coordination flow |
| streaming_debug.log | Raw streaming data including command strings |
| agent_*/*/vote.json | Vote reasoning and context |
| agent_*/*/execution_trace.md | Full tool calls, arguments, results, and reasoning - invaluable for debugging |
| execution_metadata.yaml | Config and session metadata |
Execution Traces (execution_trace.md):
These are the most detailed debug artifacts. Each agent snapshot includes an execution trace with:
- Complete tool calls with full arguments (not truncated)
- Full tool results (not truncated)
- Reasoning/thinking blocks from the model
- Timestamps and round markers
Use execution traces when you need to understand exactly what an agent did and why - they capture everything the agent saw and produced during that answer/vote iteration.
Enforcement Reliability (status.json):
The status.json file includes per-agent reliability metrics that track workflow enforcement events:
{
"agents": {
"agent_a": {
"reliability": {
"enforcement_attempts": [
{
"round": 0,
"attempt": 1,
"max_attempts": 3,
"reason": "no_workflow_tool",
"tool_calls": ["search", "read_file"],
"error_message": "Must use workflow tools",
"buffer_preview": "First 500 chars of lost content...",
"buffer_chars": 1500,
"timestamp": 1736683468.123
}
],
"by_round": {"0": {"count": 2, "reasons": ["no_workflow_tool", "invalid_vote_id"]}},
"unknown_tools": ["execute_command"],
"workflow_errors": ["invalid_vote_id"],
"total_enforcement_retries": 2,
"total_buffer_chars_lost": 3000,
"outcome": "ok"
}
}
}
}
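A quick way to roll up these metrics across agents, assuming the field names shown in the example above:

```python
def enforcement_summary(status):
    """Summarize per-agent enforcement retries, buffer loss, and the
    distinct reason codes from a parsed status.json dict."""
    summary = {}
    for agent_id, agent in status.get("agents", {}).items():
        rel = agent.get("reliability", {})
        summary[agent_id] = {
            "retries": rel.get("total_enforcement_retries", 0),
            "chars_lost": rel.get("total_buffer_chars_lost", 0),
            # Deduplicate reason codes across all enforcement attempts.
            "reasons": sorted({a["reason"] for a in rel.get("enforcement_attempts", [])}),
        }
    return summary
```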
Enforcement Reason Codes:
| Reason | Description |
|---|---|
| no_workflow_tool | Agent called tools but none were vote or new_answer |
| no_tool_calls | Agent provided text-only response, no tools called |
| invalid_vote_id | Agent voted for non-existent agent ID |
| vote_no_answers | Agent tried to vote when no answers exist |
| vote_and_answer | Agent used both vote and new_answer in same response |
| answer_limit | Agent hit max answer count limit |
| answer_novelty | Answer too similar to existing answers |
| answer_duplicate | Exact duplicate of existing answer |
| api_error | API/streaming error (e.g., "peer closed connection") |
| connection_recovery | API stream ended early, recovered with preserved context |
| mcp_disconnected | MCP server disconnected mid-session (e.g., "Server 'X' not connected") |
This data is invaluable for understanding why agents needed retries and how much content was lost due to enforcement restarts.
Logfire Setup
Before using this skill, you need to set up Logfire for observability.
Step 1: Install MassGen with Observability Support
pip install "massgen[observability]"
# Or with uv
uv pip install "massgen[observability]"
Step 2: Create a Logfire Account
Go to https://logfire.pydantic.dev/ and create a free account.
Step 3: Authenticate with Logfire
# This creates ~/.logfire/credentials.json
uv run logfire auth
# Or set the token directly as an environment variable
export LOGFIRE_TOKEN=your_token_here
Step 4: Get Your Read Token for the MCP Server
- Go to https://logfire.pydantic.dev/ and log in
- Navigate to your project settings
- Create a Read Token (this is different from the write token used for authentication)
- Copy the token for use in Step 5
Step 5: Add the Logfire MCP Server
claude mcp add logfire -e LOGFIRE_READ_TOKEN="your-read-token-here" -- uvx logfire-mcp@latest
Then restart Claude Code and re-invoke this skill.
Prerequisites
Logfire MCP Server (Optional but Recommended):
The Logfire MCP server provides enhanced analysis with precise timing data and cross-session queries. If LOGFIRE_READ_TOKEN is not set, self-analysis mode will automatically disable the Logfire MCP and fall back to local log files only.
When configured, the MCP server provides these tools:
- mcp__logfire__arbitrary_query - Run SQL queries against Logfire data
- mcp__logfire__schema_reference - Get the database schema
- mcp__logfire__find_exceptions_in_file - Find exceptions in a file
- mcp__logfire__logfire_link - Create links to traces in the UI
Required Flags:
- --automation - Clean output for programmatic parsing; see the massgen-develops-massgen skill for more info on this flag
- --logfire - Enable Logfire tracing (optional, but required to populate Logfire data)
Part 1: Running MassGen Experiments
Basic Command Format
uv run massgen --automation --logfire --config [config_file] "[question]"
Running in Background (Recommended)
Use run_in_background: true (or however you run tasks in the background) to run experiments asynchronously so you can monitor progress and end early if needed.
Expected Output (first lines):
LOG_DIR: .massgen/massgen_logs/log_YYYYMMDD_HHMMSS_ffffff
STATUS: .massgen/massgen_logs/log_YYYYMMDD_HHMMSS_ffffff/turn_1/attempt_1/status.json
QUESTION: Your task here
[Coordination in progress - monitor status.json for real-time updates]
Parse the LOG_DIR - you'll need this for file-based analysis!
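Parsing can be as simple as scanning the output for the LOG_DIR: prefix (a sketch over the output format shown above):

```python
def parse_log_dir(output):
    """Pull the log directory path out of --automation output."""
    for line in output.splitlines():
        if line.startswith("LOG_DIR:"):
            # Split only on the first colon; the path itself has none.
            return line.split(":", 1)[1].strip()
    return None
```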
Monitoring Progress
status.json updates every 2 seconds; use that to track progress.
cat [log_dir]/turn_1/attempt_1/status.json
Key fields to monitor:
- coordination.completion_percentage (0-100)
- coordination.phase - "initial_answer", "enforcement", "presentation"
- results.winner - null while running, agent_id when complete
- agents[].status - "waiting", "streaming", "answered", "voted", "error"
- agents[].error - null if ok, error details if failed
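A minimal polling loop over those fields might look like this (a sketch: it handles agents as either a dict, as in the earlier status.json example, or a list):

```python
import json
import time
from pathlib import Path

def wait_for_result(status_path, poll=2.0, timeout=600.0):
    """Poll status.json (rewritten every ~2s) until a winner is set
    or any agent reports an error."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = json.loads(Path(status_path).read_text())
        if status.get("results", {}).get("winner") is not None:
            return status
        agents = status.get("agents", {})
        agent_iter = agents.values() if isinstance(agents, dict) else agents
        if any(a.get("status") == "error" for a in agent_iter):
            return status
        time.sleep(poll)
    raise TimeoutError(f"no result within {timeout}s")
```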
Reading Final Results
After completion (exit code 0):
# Read the final answer
cat [
---
*Content truncated.*