kiln-add-model


Add new AI models to Kiln's ml_model_list.py and produce a Discord announcement. Use when the user wants to add, integrate, or register a new LLM model (e.g. Claude, GPT, DeepSeek, Gemini, Kimi, Qwen, Grok) into the Kiln model list, or mentions adding a model to ml_model_list.py.

Install

mkdir -p .claude/skills/kiln-add-model && curl -L -o skill.zip "https://mcp.directory/api/skills/download/4551" && unzip -o skill.zip -d .claude/skills/kiln-add-model && rm skill.zip

Installs to .claude/skills/kiln-add-model

About this skill

Add a New AI Model to Kiln

Integrating a new model into libs/core/kiln_ai/adapters/ml_model_list.py requires:

  1. ModelName enum – add an enum member
  2. built_in_models list – add a KilnModel(...) entry with providers
  3. ModelFamily enum – only if the vendor is brand-new

After code changes, run paid integration tests, then draft a Discord post.


Global Rules

These apply throughout the entire workflow.

  • Sandbox: All curl and uv run commands MUST use required_permissions: ["all"]. The sandbox breaks uv run (Rust panics) and blocks network access for curl.
  • Slug verification: NEVER guess or infer model slugs from naming patterns. Every model_id must come from an authoritative source (LiteLLM catalog, official docs, API reference, or changelog). If you can't verify a slug, tell the user and ask them to provide it.
  • Date awareness: These models are often released very recently. Web search for current info before assuming you know the details.

Phase 1 – Model Discovery (only when asked to find new/missing models)

If the user asks you to find new models, do NOT just web search "new AI models this week" — that only surfaces major releases. Instead, systematically check each family against both the LiteLLM catalog and models.dev, then union the results. Both are attempts to catalog available models and each has gaps the other fills.

  1. Read the ModelFamily and ModelName enums to know what we already have.

  2. Query both catalogs for each family (run in parallel where possible):

    LiteLLM catalog — filters out mirror providers to avoid duplicates:

    curl -s 'https://api.litellm.ai/model_catalog?model=SEARCH_TERM&mode=chat&page_size=500' -H 'accept: application/json' | jq '[.data[] | select(.provider != "openrouter" and .provider != "bedrock" and .provider != "bedrock_converse" and .provider != "vertex_ai-anthropic_models" and .provider != "azure") | .id] | unique | .[]'
    

    models.dev — search all model IDs across all providers:

    curl -s https://models.dev/api.json | jq '[to_entries[].value.models // {} | keys[]] | .[]' | grep -i "SEARCH_TERM"
    

    For details on a specific provider+model: curl -s https://models.dev/api.json | jq '.["PROVIDER"].models["MODEL_ID"]'

  3. Search terms (one query per term): claude, gpt, o1, o3, o4 (OpenAI reasoning), gemini, llama, deepseek, qwen, qwq, mistral, grok, kimi, glm, minimax, hunyuan, ernie, phi, gemma, seed, step, pangu

  4. Union and cross-reference results from both catalogs against ModelName. A model found in either source counts as available. Focus on direct-provider entries (not OpenRouter/Bedrock/Azure mirrors). Skip pure coding models (e.g. codestral, deepseek-coder, qwen-coder).

  5. Run targeted web searches per family to catch very fresh releases not yet in either catalog:

    • "[family] new model [current year]"
    • "[family] release [current month] [current year]"
  6. Present findings as a summary. Let the user decide which to add.
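The union-and-diff in steps 2–4 can be sketched as follows. Everything below is illustrative: the slugs, the `known_models` set, and the normalization helper are assumptions, not real catalog output or real Kiln code.

```python
# Hypothetical sketch of Phase 1 steps 2-4: union both catalogs,
# normalize slugs to enum style, and diff against what Kiln already has.
known_models = {"claude_opus_4_5", "gpt_4_1"}  # illustrative ModelName values

def to_enum_style(slug: str) -> str:
    """Normalize a catalog slug like 'vendor/claude-opus-4-6' to enum style."""
    return slug.split("/")[-1].replace("-", "_").replace(".", "_")

# Pretend results from the two catalog queries above.
litellm_hits = {"claude-opus-4-6", "claude-opus-4-5"}
models_dev_hits = {"claude-opus-4-6", "gpt-4.1"}

# A model found in either source counts as available.
available = {to_enum_style(s) for s in litellm_hits | models_dev_hits}
missing = sorted(available - known_models)
print(missing)  # → ['claude_opus_4_6']
```

The normalized names are only for diffing; actual model_id slugs must still come from an authoritative source, never from this kind of pattern mapping.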


Phase 2 – Gather Context

  1. Read the predecessor model in ml_model_list.py (e.g. for Opus 4.6 → read Opus 4.5). You inherit most parameters from it.

  2. Query the LiteLLM catalog for the new model. This is the primary slug source since Kiln uses LiteLLM. See the Slug Lookup Reference for query syntax and all verified sources.

  3. Get the OpenRouter slug via:

    • curl -s https://openrouter.ai/api/v1/models | jq '.data[].id' | grep -i "SEARCH_TERM"
    • Fallback: WebSearch for openrouter [model name] model id
  4. Get the direct-provider slug (Anthropic, OpenAI, Google, etc.). Use the LiteLLM catalog first, then official docs. See the Slug Lookup Reference for provider-specific URLs.

  5. Identify quirks — check the Provider Quirks Reference for the relevant provider, and web search for any new quirks:

    • Structured output mode (JSON schema vs function calling)?
    • Reasoning model (needs reasoning_capable, parsers, OpenRouter options)?
    • Vision/multimodal support? Which MIME types?
    • Provider-specific flags (temp_top_p_exclusive, etc.)?
    • Rate limit concerns (max_parallel_requests)?
  6. Determine thinking levels — does the model support configurable reasoning effort? See Thinking Levels Reference for the full lookup chain. Key quick checks:

    • Check the vendor model page (e.g. OpenAI model pages say "Reasoning.effort supports: X, Y, Z")
    • Check OpenRouter supported_parameters — if reasoning is absent, skip thinking levels
    • R1-style thinking models (DeepSeek, Qwen thinking variants) do NOT get thinking level dicts

Phase 3 – Code Changes

All changes go in libs/core/kiln_ai/adapters/ml_model_list.py.

3a. ModelName enum

  • snake_case: claude_opus_4_6 = "claude_opus_4_6"
  • Place before predecessor (newer first within group)
  • Follow existing grouping (all claude together, all gpt together, etc.)
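A minimal sketch of the pattern (the members shown are examples only; the real enum in ml_model_list.py is far longer):

```python
from enum import Enum

# Illustrative sketch of the ModelName convention: snake_case member
# equal to its string value, newer model placed before its predecessor,
# vendors kept grouped together.
class ModelName(str, Enum):
    claude_opus_4_6 = "claude_opus_4_6"  # new entry, above its predecessor
    claude_opus_4_5 = "claude_opus_4_5"  # predecessor
    gpt_4_1 = "gpt_4_1"                  # separate vendor group
```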

3b. KilnModel entry in built_in_models

  • Place before predecessor entry (newer = higher in list)
  • Copy predecessor's structure and modify: name, friendly_name, model_id per provider, flags
  • friendly_name must follow the existing naming pattern of sibling models in the same family. Check the predecessor. For example, Claude Sonnets use "Claude {version} Sonnet" (e.g. "Claude 4.5 Sonnet"), not "Claude Sonnet {version}". Do NOT use the vendor's marketing name if it differs from Kiln's established convention.
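As a shape-only sketch: the real KilnModel and provider classes live in kiln_ai and their exact fields must be copied from the predecessor entry; the stand-in dataclasses and the slug below are hypothetical.

```python
from dataclasses import dataclass, field

# Stand-ins for kiln_ai's real classes; field names mirror the ones this
# guide mentions but must be checked against the predecessor entry.
@dataclass
class KilnModelProvider:
    name: str
    model_id: str
    structured_output_mode: str = "json_schema"

@dataclass
class KilnModel:
    family: str
    name: str
    friendly_name: str
    providers: list = field(default_factory=list)

new_model = KilnModel(
    family="claude",
    name="claude_opus_4_6",          # matches the ModelName enum member
    friendly_name="Claude 4.6 Opus",  # sibling naming pattern, not marketing name
    providers=[
        KilnModelProvider(
            name="openrouter",
            model_id="anthropic/claude-opus-4-6",  # HYPOTHETICAL slug: verify it
        ),
    ],
)
```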

Provider model_id formats:

| Provider | Format | Notes |
| --- | --- | --- |
| openrouter | vendor/model-name | Always verify via API |
| openai | Bare model name | Verify via OpenAI docs |
| anthropic | Variable — older models have date stamps, newer may not | Always verify via Anthropic docs |
| gemini_api | Bare name | Verify via Google AI Studio docs |
| fireworks_ai | accounts/fireworks/models/... | Verify via Fireworks docs |
| together_ai | Vendor path format | Verify via Together docs |
| vertex | Usually same as gemini_api | Verify via Vertex docs |
| siliconflow_cn | Vendor/model format | Verify via SiliconFlow docs |

Every single model_id must be verified from an authoritative source. No exceptions.

Setting flags — use catalog data + predecessor as dual signals:

The LiteLLM catalog and models.dev responses include capability flags (supports_vision, supports_function_calling, supports_reasoning, etc.). Use these as the primary signal for what to enable on the new model:

  • If the catalog says supports_vision: true → enable supports_vision, multimodal_capable, and vision MIME types (see the multimodal capabilities section below)
  • If the catalog says supports_function_calling: true → use StructuredOutputMode.json_schema (or function_calling depending on provider norms — check predecessor)
  • If the catalog says supports_reasoning: true → enable reasoning_capable and check if parser/formatter/thinking flags are needed

Then cross-check against the predecessor. The predecessor tells you how Kiln configures a similar model (which structured_output_mode, which provider-specific flags, etc.). The catalog tells you what the model can do. Use both:

  • Catalog says the model supports vision but predecessor doesn't have it? Enable it — this is a new capability.
  • Predecessor has temp_top_p_exclusive but nothing in the catalog mentions it? Keep it — it's a provider quirk the catalog doesn't track.
  • Catalog and predecessor disagree on something? Trust the catalog for capabilities, trust the predecessor for Kiln-specific configuration patterns.
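One way to picture the dual-signal rule as a merge, with illustrative flag names (the real flags are listed just below):

```python
# Sketch of the dual-signal rule: the catalog drives capabilities, the
# predecessor drives Kiln-specific configuration and provider quirks.
catalog = {"supports_vision": True, "supports_reasoning": False}
predecessor_flags = {"temp_top_p_exclusive": True, "supports_vision": False}

flags = dict(predecessor_flags)  # start from Kiln's known-good config
# Capabilities: trust the catalog, even when the predecessor lacks them.
for cap in ("supports_vision", "supports_reasoning"):
    if catalog.get(cap):
        flags[cap] = True
# Quirks like temp_top_p_exclusive survive from the predecessor because
# catalogs don't track them.
print(flags)  # → {'temp_top_p_exclusive': True, 'supports_vision': True}
```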

Common flags:

  • structured_output_mode – how the model handles JSON output
  • suggested_for_evals / suggested_for_data_gen – see zero-sum rule below
  • multimodal_capable / supports_vision / supports_doc_extraction – see multimodal rules below
  • reasoning_capable – for thinking/reasoning models
  • temp_top_p_exclusive – Anthropic models that can't have both temp and top_p
  • parser / formatter – for models needing special parsing (e.g. R1-style thinking)

3c. Multimodal capabilities

If the model supports non-text inputs, configure:

  • multimodal_capable=True and supports_doc_extraction=True if it supports any MIME types
  • supports_vision=True if it supports images
  • multimodal_requires_pdf_as_image=True if vision-capable but no native PDF support (also add KilnMimeType.PDF to MIME list). Always set this on OpenRouter providers — OpenRouter routes PDFs through Mistral OCR which breaks LiteLLM parsing.
  • Always include KilnMimeType.TXT and KilnMimeType.MD on any multimodal_capable model

Strategy: start broad, narrow based on test failures. Enable a generous set of MIME types, run tests, and remove only types the provider explicitly rejects (400 errors). Don't remove types for timeout/auth/content-mismatch failures.

Full MIME superset (Gemini uses all):

# documents
KilnMimeType.PDF, KilnMimeType.CSV, KilnMimeType.TXT, KilnMimeType.HTML, KilnMimeType.MD
# images
KilnMimeType.JPG, KilnMimeType.PNG
# audio
KilnMimeType.MP3, KilnMimeType.WAV, KilnMimeType.OGG
# video
KilnMimeType.MP4, KilnMimeType.MOV
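The start-broad, narrow-on-400s strategy can be sketched with a stand-in enum (the real KilnMimeType lives in kiln_ai; members here are a subset of the superset above):

```python
from enum import Enum

# Stand-in for kiln_ai's KilnMimeType, illustrating the narrowing strategy.
class KilnMimeType(str, Enum):
    PDF = "application/pdf"
    TXT = "text/plain"
    MD = "text/markdown"
    JPG = "image/jpeg"
    PNG = "image/png"
    MP3 = "audio/mpeg"

enabled = set(KilnMimeType)             # start broad: the full superset
rejected_with_400 = {KilnMimeType.MP3}  # e.g. provider explicitly rejects audio
enabled -= rejected_with_400            # narrow only on explicit 400 rejections
# Timeouts/auth/content-mismatch failures do NOT justify removing a type,
# and TXT + MD stay enabled on any multimodal_capable model.
```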

3d. suggested_for_evals / suggested_for_data_gen

Only set these if the predecessor already has them, OR web search shows the model is a clear SOTA leap (ask user to confirm first).

Zero-sum rule: When adding a new model with these flags, remove them from the oldest same-family model to keep the suggested count stable. Ask the user to confirm the swap before making changes.

3e. ModelFamily enum (only if needed)

Only add a new family if the vendor is completely new.

3f. Thinking Levels (`available_thinking_l


Content truncated.
