massgen-develops-massgen

Guide for using MassGen to develop and improve itself. This skill should be used when agents need to run MassGen experiments programmatically (using automation mode) OR analyze terminal UI/UX quality (using visual evaluation tools). These are mutually exclusive workflows for different improvement goals.

Install

mkdir -p .claude/skills/massgen-develops-massgen && curl -L -o skill.zip "https://mcp.directory/api/skills/download/6070" && unzip -o skill.zip -d .claude/skills/massgen-develops-massgen && rm skill.zip

Installs to .claude/skills/massgen-develops-massgen

About this skill

MassGen Develops MassGen

This skill provides guidance for using MassGen to develop and improve itself. Choose the appropriate workflow based on what you're testing.

Two Workflows

Automation Mode - Test backend functionality, coordination logic, agent responses
Visual Evaluation - Test terminal display, colors, layout, UX

Workflow 1: Automation Mode

Use this to test functionality without visual inspection. Ideal for programmatic testing.

Running MassGen with Automation

Run MassGen in the background (exact mechanism depends on your tooling):

uv run massgen --automation --config massgen/configs/basic/multi/two_agents_gemini.yaml "What is 2+2?"

For MassGen agents: Use custom_tool__start_background_tool targeting mcp__command_line__execute_command, then poll with custom_tool__get_background_tool_status / custom_tool__get_background_tool_result.
For Claude Code: Use the Bash tool's run_in_background parameter.

Why Automation Mode

Feature	Benefit
Clean output	~10 parseable lines vs 3,000+ ANSI codes
LOG_DIR printed	First line shows log directory path
status.json	Real-time monitoring file
Exit codes	0=success, 1=config, 2=execution, 3=timeout, 4=interrupted
Workspace isolation	Safe parallel execution
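
When scripting around a run, the exit codes in the table can be turned into readable labels. The following is a minimal sketch: the helper name describe_exit_code is hypothetical, and the labels are taken only from the table above.

```python
# Exit codes as listed in the table above; labels are copied from it.
EXIT_CODES = {
    0: "success",
    1: "config",
    2: "execution",
    3: "timeout",
    4: "interrupted",
}


def describe_exit_code(code: int) -> str:
    """Return a label for an automation-mode exit code (hypothetical helper)."""
    return EXIT_CODES.get(code, f"unknown ({code})")


if __name__ == "__main__":
    print(describe_exit_code(3))
```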

Expected Output

LOG_DIR: .massgen/massgen_logs/log_20251120_143022_123456
STATUS: .massgen/massgen_logs/log_20251120_143022_123456/status.json

🤖 Multi-Agent Mode
Agents: gemini-2.5-pro1, gemini-2.5-pro2
Question: What is 2+2?

============================================================
QUESTION: What is 2+2?
[Coordination in progress - monitor status.json for real-time updates]

WINNER: gemini-2.5-pro1
DURATION: 33.4s
ANSWER_PREVIEW: The answer is 4.

COMPLETED: 2 agents, 35.2s total

Parse LOG_DIR from the first line to find the log directory.
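
One way to extract LOG_DIR and the other labeled lines is a small parser over the captured output. This is a sketch that assumes the output keeps the "KEY: value" shape shown above; the function name parse_automation_output is hypothetical.

```python
import re

# Matches the labeled result lines shown in the expected output above.
_FIELD = re.compile(
    r"^(LOG_DIR|STATUS|WINNER|DURATION|ANSWER_PREVIEW|COMPLETED):\s*(.*)$"
)


def parse_automation_output(text: str) -> dict:
    """Collect the first occurrence of each labeled line (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        match = _FIELD.match(line.strip())
        if match:
            # setdefault keeps the first value if a label repeats
            fields.setdefault(match.group(1), match.group(2).strip())
    return fields
```

The returned dictionary gives the log directory as fields["LOG_DIR"], which can then be passed to any monitoring script.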

Monitoring Progress

Read the status.json file (updated every 2 seconds):

cat .massgen/massgen_logs/log_20251120_143022_123456/status.json

Key fields:

{
  "coordination": {
    "completion_percentage": 65,
    "phase": "enforcement"
  },
  "results": {
    "winner": null  // null = running, "agent_id" = done
  },
  "agents": {
    "agent_a": {
      "status": "streaming",
      "error": null
    }
  }
}

Agent status values: waiting, streaming, answered, voted, completed, error
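
The key fields above can be read with a short helper. This is a sketch that assumes only the fields shown in this section; load_status and summarize_status are hypothetical names. Because the file is rewritten every few seconds, a read may land mid-write, so the loader treats a parse failure as "try again later".

```python
import json
from pathlib import Path


def load_status(status_path):
    """Return the parsed status.json, or None if it is missing or mid-write."""
    try:
        return json.loads(Path(status_path).read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return None


def summarize_status(status: dict) -> dict:
    """Reduce a status dict to the fields described above (hypothetical helper)."""
    coordination = status.get("coordination", {})
    agents = status.get("agents", {})
    winner = status.get("results", {}).get("winner")
    return {
        "done": winner is not None,  # winner stays null while the run is active
        "winner": winner,
        "phase": coordination.get("phase"),
        "completion_percentage": coordination.get("completion_percentage"),
        "failed_agents": sorted(
            agent_id
            for agent_id, info in agents.items()
            if info.get("status") == "error" or info.get("error")
        ),
    }
```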

Reading Results

After completion (exit code 0):

# Read final answer
cat [log_dir]/final/[winner]/answer.txt
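
The same read can be done from Python. This is a sketch that assumes the [log_dir]/final/[winner]/answer.txt layout shown above; read_final_answer is a hypothetical helper name.

```python
from pathlib import Path


def read_final_answer(log_dir, winner: str) -> str:
    """Read the winner's final answer from the layout shown above."""
    answer_file = Path(log_dir) / "final" / winner / "answer.txt"
    return answer_file.read_text(encoding="utf-8")
```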

Timing Expectations

Standard tasks: 2-10 minutes
Complex/meta tasks: 10-30 minutes
Check if stuck: Read status.json several times - if completion_percentage increases between reads, the run is still making progress
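
The stuck check can be expressed as a comparison over a few completion_percentage readings. This is a sketch under the assumption that a flat reading over the sampled window means no progress; looks_stuck is a hypothetical helper, and the sampling window is left to the caller.

```python
def looks_stuck(samples) -> bool:
    """Return True if completion_percentage has not increased across samples.

    samples: readings of completion_percentage, oldest first.
    With fewer than two readings there is nothing to compare, so return False.
    """
    if len(samples) < 2:
        return False
    return samples[-1] <= samples[0]
```

A finished run also reports a flat percentage, so check results.winner before treating a flat reading as a hang.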

Advanced: Multiple Background Monitors

You can create multiple background monitoring tasks that run independently alongside the main MassGen process. Each monitor can track different aspects and write to separate log files for later inspection.

Approach

Create small Python scripts that run in background shells. Each script:

Monitors a specific aspect (tokens, errors, progress, coordination, etc.)
Writes timestamped data to its own log file
Runs in a loop with sleep() intervals
Can be checked anytime without blocking the main task

Example Monitor Scripts

Token Usage Monitor (token_monitor.py):

import json, time, sys
from pathlib import Path

log_dir = Path(sys.argv[1])  # Pass LOG_DIR as argument
while True:
    if (log_dir / "status.json").exists():
        with open(log_dir / "status.json") as f:
            data = json.load(f)
        with open("token_monitor.log", "a") as log:
            log.write(f"=== {time.strftime('%H:%M:%S')} ===\n")
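            # These keys are not among the status.json fields shown earlier;
            # adjust them to match the fields in your status.json.
            log.write(f"Tokens: {data.get('total_tokens_used', 0)}\n")
            log.write(f"Cost: ${data.get('total_cost', 0):.4f}\n\n")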
    time.sleep(5)

Error Monitor (error_monitor.py):

import time, sys
from pathlib import Path

log_dir = Path(sys.argv[1])
while True:
    if log_dir.exists():
        with open("error_monitor.log", "a") as log:
            log.write(f"=== {time.strftime('%H:%M:%S')} ===\n")
            errors = []
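            for logfile in log_dir.glob("*.log"):
                # errors="replace" keeps the monitor alive on undecodable bytes
                with open(logfile, errors="replace") as f:
                    for line in f:
                        if any(x in line.lower() for x in ['error', 'warning', 'failed']):
                            errors.append(line.strip())
            # Log the five most recent matches; every entry ends with a blank line
            log.write('\n'.join(errors[-5:]) + "\n" if errors else "No errors\n")
            log.write("\n")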
    time.sleep(5)

Progress Monitor (progress_monitor.py):

import json, time, sys
from pathlib import Path

log_dir = Path(sys.argv[1])
while True:
    if (log_dir / "status.json").exists():
        with open(log_dir / "status.json") as f:
            data = json.load(f)
        with open("progress_monitor.log", "a") as log:
            log.write(f"=== {time.strftime('%H:%M:%S')} ===\n")
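            # completion_percentage is nested under "coordination" in status.json
            progress = data.get('coordination', {}).get('completion_percentage', 0)
            # Count agents that are currently streaming ('active' is not a listed status value)
            active = sum(1 for a in data.get('agents', {}).values()
                        if a.get('status') == 'streaming')
            log.write(f"Progress: {progress}% Active agents: {active}\n\n")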
    time.sleep(5)

Coordination Monitor (coordination_monitor.py):

import json, time, sys
from pathlib import Path

log_dir = Path(sys.argv[1])
while True:
    if (log_dir / "status.json").exists():
        with open(log_dir / "status.json") as f:
            data = json.load(f)
        coord = data.get('coordination', {})
        with open("coordination_monitor.log", "a") as log:
            log.write(f"=== {time.strftime('%H:%M:%S')} ===\n")
            log.write(f"Phase: {coord.get('phase', 'unknown')}\n")
            log.write(f"Round: {coord.get('round', 0)}\n")
            log.write(f"Total answers: {coord.get('total_answers', 0)}\n\n")
    time.sleep(5)

Workflow

Launch main task, parse the LOG_DIR from output
Create monitor scripts as needed (write Python files)
Launch monitors in background shells: python3 token_monitor.py [LOG_DIR] &
Check monitor logs anytime by reading the .log files
When complete, kill monitor processes and analyze logs
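
The launch and kill steps can also be driven from Python instead of shell job control. This is a sketch: start_monitors and stop_monitors are hypothetical helpers, and each script is assumed to take LOG_DIR as its only argument, as in the examples above.

```python
import subprocess
import sys


def start_monitors(scripts, log_dir):
    """Start each monitor script as a background process and return the handles."""
    return [
        subprocess.Popen([sys.executable, str(script), str(log_dir)])
        for script in scripts
    ]


def stop_monitors(processes, timeout=5.0):
    """Terminate monitors, escalating to kill if one does not exit in time."""
    for proc in processes:
        proc.terminate()
    for proc in processes:
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
```

Keeping the process handles makes cleanup deterministic, because there is no need to look up PIDs afterwards.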

Custom Monitors

Create monitors for any metric you want to track:

Model-specific performance metrics
Memory/context usage patterns
Real-time cost accumulation
Answer quality trends
Agent coordination patterns
Specific error categories

Benefits:

Non-blocking inspection of specific metrics on demand
Historical data captured for post-run analysis
Independent monitoring streams for different aspects
Easy to add new monitors without modifying configs

Workflow 2: Visual Evaluation

Use this to analyze and improve MassGen's terminal display quality. Requires tools from custom_tools/_multimodal_tools/.

Important: This workflow records the rich terminal display, so the actual recording does NOT use --automation mode. However, you should ALWAYS pre-test with --automation first.

Prerequisites

You should have these tools available in your workspace:

run_massgen_with_recording - Records terminal sessions as video
understand_video - Analyzes video frames with GPT-4.1 vision

Step 0: Pre-Test with Automation (REQUIRED)

Before recording the video, verify the config works and API keys are valid:

# Start with --automation to verify everything works
uv run massgen --automation --config [config_path] "[question]"

Wait 30-60 seconds (enough to verify API keys, config parsing, tool initialization), then kill the process.

Why this is critical:

Detects config errors before wasting recording time
Validates API keys are present and working
Ensures tools initialize correctly
Prevents recording a broken session

If the automation test fails, fix the issues before proceeding to recording.
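
The "run briefly, then kill" pre-test can be scripted. This is a sketch under stated assumptions: pretest is a hypothetical helper, and a process that is still alive when the window ends is treated as having passed the startup checks.

```python
import subprocess


def pretest(command, seconds=45):
    """Run `command` for up to `seconds`.

    Returns None if the process was still running when the window ended
    (it is then terminated), otherwise the exit code it finished with.
    """
    proc = subprocess.Popen(command)
    try:
        return proc.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        proc.terminate()
        try:
            proc.wait(timeout=5)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
        return None
```

For example, pretest(["uv", "run", "massgen", "--automation", "--config", config_path, question]) returning a non-zero code within the window points to a failure to fix before recording, while None or 0 means the run got through startup.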

Step 1: Record a MassGen Session

After the automation pre-test succeeds, record the visual session:

from custom_tools._multimodal_tools.run_massgen_with_recording import run_massgen_with_recording

result = await run_massgen_with_recording(
    config_path="massgen/configs/basic/multi/two_agents_gemini.yaml",
    question="What is 2+2?",
    output_format="mp4",  # ALWAYS use mp4 for maximum compatibility
    timeout_seconds=120,
    width=1920,
    height=1080
)

Format recommendation: Always use "mp4" for maximum compatibility. GIF and WebM are supported but MP4 is preferred.

The recording captures: Rich terminal display with colors, status indicators, coordination visualization (WITHOUT --automation flag).

Step 2: Analyze the Recording

Use understand_video to analyze the MP4 recording. Call it at least once, and call it again with a different prompt for each additional aspect you want to analyze:

from custom_tools._multimodal_tools.understand_video import understand_video

# Overall UX evaluation
ux_eval = await understand_video(
    video_path=result["video_path"],  # The MP4 file from Step 1
    prompt="Evaluate the overall terminal display quality, clarity, and usability",
    num_frames=12
)

# Focused on coordination
coordination_eval = await understand_video(
    video_path=result["video_path"],
    prompt="How clearly does the display show agent coordination phases and voting?",
    num_frames=8
)

# Status indicators
status_eval = await understand_video(
    video_path=result["video_path"],
    prompt="Are status indicators (streaming, answered, voted) clear and visually distinct?",
    num_frames=8
)

Key points:

The recording tool saves the video to the workspace - use that path for analysis
You can call understand_video multiple times on the same video with different prompts
Each call focuses on a specific aspect (UX, coordination, status, colors, etc.)

Evaluation Criteria

When analyzing terminal dis

Content truncated.


Author

massgen

7 skills published
