smart-model-switching-glm
Auto-route tasks to the cheapest z.ai (GLM) model that handles them correctly. Three-tier progression: Flash → Standard → Plus/32B. Classify before responding. FLASH (default): factual Q&A, greetings, reminders, status checks, lookups, simple file ops, heartbeats, casual chat, 1–2 sentence tasks, cron jobs. ESCALATE TO STANDARD: code >10 lines, analysis, comparisons, planning, reports, multi-step reasoning, tables, long writing (>3 paragraphs), summarization, research synthesis, most user conversations. ESCALATE TO PLUS/32B: architecture decisions, complex debugging, multi-file refactoring, strategic planning, nuanced judgment, deep research, critical production decisions. Rule: if a human would need >30 seconds of focused thinking, escalate. If Standard struggles with complexity, go to Plus/32B. Cut API costs by starting cheap and escalating only when needed.
Install
mkdir -p .claude/skills/smart-model-switching-glm && curl -L -o skill.zip "https://mcp.directory/api/skills/download/5564" && unzip -o skill.zip -d .claude/skills/smart-model-switching-glm && rm skill.zip
Installs to .claude/skills/smart-model-switching-glm
About this skill
Smart Model Switching
Three-tier z.ai (GLM) routing: Flash → Standard → Plus / 32B
Start with the cheapest model. Escalate only when needed. Designed to minimize API cost without sacrificing correctness.
The Golden Rule
If a human would need more than 30 seconds of focused thinking, escalate from Flash to Standard.
If the task involves architecture, complex tradeoffs, or deep reasoning, escalate to Plus / 32B.
Model Reality (Relative)
| Tier | Example Models | Purpose |
|---|---|---|
| Flash | GLM-4.5-Flash, GLM-4.7-Flash | Fastest & cheapest |
| Standard | GLM-4.6, GLM-4.7 | Strong reasoning & code |
| Plus / 32B | GLM-4-Plus, GLM-4-32B-128K | Heavy reasoning & architecture |
Bottom line: Wrong model selection wastes money OR time. Flash for simple, Standard for normal work, Plus/32B for complex decisions.
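The tier table can be kept as a small lookup so routing code never hard-codes model names in several places. The following is a minimal sketch in Python, not part of the skill itself: the names `Tier`, `MODELS`, `default_model`, and `escalate` are hypothetical, and picking the first listed model per tier is an arbitrary assumption. The model identifiers are copied from the table; confirm them against your own z.ai account before relying on them.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Routing tiers, ordered from cheapest to most capable."""
    FLASH = 0
    STANDARD = 1
    PLUS = 2


# Example models per tier, copied from the table above.
MODELS = {
    Tier.FLASH: ["GLM-4.5-Flash", "GLM-4.7-Flash"],
    Tier.STANDARD: ["GLM-4.6", "GLM-4.7"],
    Tier.PLUS: ["GLM-4-Plus", "GLM-4-32B-128K"],
}


def default_model(tier: Tier) -> str:
    """Return the first example model for a tier (an arbitrary choice)."""
    return MODELS[tier][0]


def escalate(tier: Tier) -> Tier:
    """Move one tier up, staying at PLUS once the top is reached."""
    return Tier(min(tier + 1, Tier.PLUS))
```

Starting at `Tier.FLASH` and calling `escalate` only when a result is unsatisfactory mirrors the "start cheap" progression; there is deliberately no function for skipping a tier.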
💚 FLASH — Default for Simple Tasks
Stay on Flash for:
- Factual Q&A — “what is X”, “who is Y”, “when did Z”
- Quick lookups — definitions, unit conversions, short translations
- Status checks — monitoring, file reads, session state
- Heartbeats — periodic checks, OK responses
- Memory & reminders
- Casual conversation — greetings, acknowledgments
- Simple file ops — read, list, basic writes
- One-liner tasks — anything answerable in 1–2 sentences
- Cron jobs (always Flash by default)
NEVER do these on Flash
- ❌ Write code longer than 10 lines
- ❌ Create comparison tables
- ❌ Write more than 3 paragraphs
- ❌ Do multi-step analysis
- ❌ Write reports or proposals
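Three of the limits above (code length, paragraph count, tables) are mechanical enough to check on a draft answer. The sketch below is one possible guard, assuming the draft is Markdown text; the function name `exceeds_flash_limits` and the detection heuristics are hypothetical. It does not attempt to detect multi-step analysis or reports, which need judgment rather than counting.

```python
import re

FENCE = "`" * 3       # Markdown code fence marker
MAX_CODE_LINES = 10   # "code longer than 10 lines"
MAX_PARAGRAPHS = 3    # "more than 3 paragraphs"


def exceeds_flash_limits(draft: str) -> bool:
    """Return True if a draft breaks one of the mechanical Flash limits."""
    parts = draft.split(FENCE)
    code_parts = parts[1::2]           # text between fences
    prose = "\n\n".join(parts[0::2])   # everything outside fences

    # Count non-empty code lines, skipping each block's language-tag line.
    code_lines = sum(
        1
        for part in code_parts
        for line in part.splitlines()[1:]
        if line.strip()
    )
    # Paragraphs are runs of prose separated by blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", prose) if p.strip()]
    # Treat any prose line starting with a pipe as a Markdown table row.
    has_table = any(line.lstrip().startswith("|") for line in prose.splitlines())

    return (
        code_lines > MAX_CODE_LINES
        or len(paragraphs) > MAX_PARAGRAPHS
        or has_table
    )
```

If the check returns `True` for a Flash draft, the task is a candidate for rerunning on Standard.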
💛 STANDARD — Core Workhorse
Escalate to Standard for:
Code & Technical
- Code generation — functions, scripts, features
- Debugging — normal bug investigation
- Code review — PRs, refactors
- Documentation — README, comments, guides
Analysis & Planning
- Comparisons and evaluations
- Planning — roadmaps, task breakdowns
- Research synthesis
- Multi-step reasoning
Writing & Content
- Long-form writing (>3 paragraphs)
- Summaries of long documents
- Structured output — tables, outlines
Most real user conversations belong here.
❤️ PLUS / 32B — Complex Reasoning Only
Escalate to Plus / 32B for:
Architecture & Design
- System and service architecture
- Database schema design
- Distributed or multi-tenant systems
- Major refactors across multiple files
Deep Analysis
- Complex debugging (race conditions, subtle bugs)
- Security reviews
- Performance optimization strategy
- Root cause analysis
Strategic & Judgment-Based Work
- Strategic planning
- Nuanced judgment and ambiguity
- Deep or multi-source research
- Critical production decisions
🔄 Implementation
For Subagents
// Routine monitoring
sessions_spawn(task="Check backup status", model="GLM-4.5-Flash")
// Standard code work
sessions_spawn(task="Build the REST API endpoint", model="GLM-4.7")
// Architecture decisions
sessions_spawn(task="Design the database schema for multi-tenancy", model="GLM-4-Plus")
For Cron Jobs
{
  "payload": {
    "kind": "agentTurn",
    "model": "GLM-4.5-Flash"
  }
}
Always use Flash for cron unless the task genuinely needs reasoning.
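If cron payloads are generated by a script, the Flash default can be encoded once so that using a larger model is always an explicit choice. This is a small sketch: the helper name `cron_payload` is hypothetical, and it builds only the fragment shown above. The rest of a cron job definition (schedule, prompt, and so on) is not shown in this document and is left out here.

```python
import json

FLASH_MODEL = "GLM-4.5-Flash"  # Flash model used in the cron example above


def cron_payload(model: str = FLASH_MODEL) -> dict:
    """Build the payload fragment shown above, defaulting to Flash."""
    return {"payload": {"kind": "agentTurn", "model": model}}


# Default: Flash. Pass a model explicitly only when the job needs reasoning.
print(json.dumps(cron_payload(), indent=2))
```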
📊 Quick Decision Tree
Is it a greeting, lookup, status check, or 1–2 sentence answer?
  YES → FLASH
  NO ↓
Is it code, analysis, planning, writing, or multi-step?
  YES → STANDARD
  NO ↓
Is it architecture, deep reasoning, or a critical decision?
  YES → PLUS / 32B
  NO → Default to STANDARD, escalate if struggling
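The tree above can be sketched as a function that asks the three questions in the same order and falls back to Standard. This is only an illustration: the keyword patterns are hypothetical stand-ins for the judgment the skill expects the model to apply when it classifies a task, and the names `FLASH_RE`, `STANDARD_RE`, `PLUS_RE`, and `classify` are not part of the skill.

```python
import re

# Hypothetical keyword hints for each question in the tree; tune for your workload.
FLASH_RE = re.compile(
    r"\b(hello|hi|thanks|status|lookup|remind\w*|what is|who is|when did)\b", re.I
)
STANDARD_RE = re.compile(
    r"\b(code|implement\w*|build|debug\w*|analy\w*|compar\w*|plan\w*"
    r"|report\w*|summar\w*|write)\b",
    re.I,
)
PLUS_RE = re.compile(
    r"\b(architect\w*|schema|trade-?offs?|root cause|critical|multi-tenan\w*)\b", re.I
)


def classify(task: str) -> str:
    """Walk the decision tree top to bottom and return a tier name."""
    if FLASH_RE.search(task):
        return "FLASH"
    if STANDARD_RE.search(task):
        return "STANDARD"
    if PLUS_RE.search(task):
        return "PLUS"
    return "STANDARD"  # default; escalate if Standard struggles


for task in (
    "Check backup status",
    "Build the REST API endpoint",
    "Design the database schema for multi-tenancy",
):
    print(f"{task!r} -> {classify(task)}")
```

Because the questions are asked in tree order, a task that matches both a Standard and a Plus hint (for example, debugging a race condition) lands on Standard first and relies on the "escalate if struggling" step to reach Plus / 32B.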
📋 Quick Reference Card
┌─────────────────────────────────────────────────────────────┐
│ SMART MODEL SWITCHING │
│ Flash → Standard → Plus / 32B │
├─────────────────────────────────────────────────────────────┤
│ 💚 FLASH (cheapest) │
│ • Greetings, status checks, quick lookups │
│ • Factual Q&A, reminders │
│ • Simple file ops, 1–2 sentence answers │
├─────────────────────────────────────────────────────────────┤
│ 💛 STANDARD (workhorse) │
│ • Code > 10 lines, debugging │
│ • Analysis, comparisons, planning │
│ • Reports, long writing │
├─────────────────────────────────────────────────────────────┤
│ ❤️ PLUS / 32B (complex) │
│ • Architecture decisions │
│ • Complex debugging, multi-file refactoring │
│ • Strategic planning, deep research │
├─────────────────────────────────────────────────────────────┤
│ 💡 RULE: >30 sec human thinking → escalate │
│ 💰 START CHEAP → SCALE ONLY WHEN NEEDED │
└─────────────────────────────────────────────────────────────┘
Built for z.ai (GLM) setups.