Updated April 2026 · Cookbook · 19 min read

Claude ComfyUI workflow-builder skill: 10 node graphs in one prompt

Ten real ComfyUI workflows — basic txt2img, img2img with denoise, ControlNet depth, SDXL two-pass refiner, three-LoRA stack, masked inpainting, 4x upscale, AnimateDiff loop, IPAdapter face+style combo, batched API queue — each as a single Claude prompt that emits API-format JSON ready to POST to your local server on port 8188.

Already know what skills are? Skip to the cookbook. First time? Read the explainer then come back. Need the install? It’s on the /skills/comfyui-workflow-builder page.

Editorial illustration: a luminous teal node-graph network of four connected geometric nodes linked by glowing curved cables, with a single image-frame rectangle showing a mountain-and-sun render to its right, on a midnight navy background — a ComfyUI workflow turning into a generated image.
On this page · 21 sections
  1. What this skill does
  2. The cookbook
  3. Install + README
  4. Watch it being built
  5. 01 · Basic txt2img: the seven-node hello world
  6. 02 · img2img with denoise control
  7. 03 · ControlNet pose with a depth or canny preprocessor
  8. 04 · SDXL base + refiner two-pass workflow
  9. 05 · LoRA stack with three weights tuned
  10. 06 · Inpainting with a mask brush
  11. 07 · Upscale via 4x-UltraSharp + tile pass
  12. 08 · AnimateDiff motion module — 16-frame loop
  13. 09 · IPAdapter face transfer + style transfer combo
  14. 10 · Batch render N variants from a prompt list
  15. Community signal
  16. The contrarian take
  17. Real workflows shipped
  18. Gotchas
  19. Pairs well with
  20. FAQ
  21. Sources

What this skill actually does

Sixty seconds of context before the cookbook — what the comfyui-workflow-builder skill is, the exact JSON shape Claude returns, and the one thing it does NOT do for you.


Generates optimized ComfyUI workflows for image generation, editing, and enhancement.

Snoopiam · comfyui-workflow-builder SKILL.md · /skills/comfyui-workflow-builder

What Claude returns

Returns ComfyUI workflows in the API JSON format — a flat object keyed by node ID, where each node has class_type (CheckpointLoaderSimple, CLIPTextEncode, KSampler, VAEDecode, SaveImage, ControlNetLoader, ControlNetApplyAdvanced, LoraLoader, LoadImage, EmptyLatentImage, IPAdapter, AnimateDiffLoaderGen1, UltimateSDUpscale) and inputs that connect to other nodes via ["node_id", output_index] tuples. The output is POSTable to http://127.0.0.1:8188/prompt, drag-droppable onto the ComfyUI canvas, and produces PNG outputs with the workflow embedded in metadata.

What it does NOT do

It does not install ComfyUI, run the server, download checkpoints, or install custom-node packs — you still need a running ComfyUI instance, the .safetensors files in models/, and ComfyUI-Manager for any custom nodes the workflow references.

How you trigger it

'build me a ComfyUI workflow for txt2img with SDXL …' · 'generate a ControlNet depth workflow that …' · 'give me an AnimateDiff workflow.json that loops 16 frames at …'

Cost when idle

Roughly 110 tokens of skill metadata stay loaded per turn. The full SKILL.md and node-reference catalog only load when Claude actually drafts a workflow, so day-to-day chat cost is unchanged.

One format note. ComfyUI ships two JSON shapes for the same graph — the workflow JSON the GUI saves (with positions and widget values for the canvas) and the API JSON the /prompt endpoint executes (a flat object keyed by node ID with class_type and inputs). The skill emits API JSON because that’s what scripts queue and what round-trips cleanly. To go GUI-side from the API output, drag the rendered PNG onto the ComfyUI canvas — every output PNG carries its workflow in metadata, which is the whole reason ComfyUI’s drag-PNG-to-load feature became famous.
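If you are ever unsure which shape a JSON file on disk is, a ten-line check before queueing saves a failed POST. A minimal sketch, not part of the skill's output; the file name is a placeholder:

import json

def detect_comfy_format(path: str) -> str:
    """Guess whether a file is GUI 'workflow JSON' or executable 'API JSON'."""
    graph = json.load(open(path))
    # GUI format: top-level "nodes"/"links" arrays with canvas positions and widget values.
    if isinstance(graph, dict) and "nodes" in graph and "links" in graph:
        return "workflow (GUI) JSON: load it on the canvas, not via /prompt"
    # API format: flat object keyed by node ID, each entry carrying class_type + inputs.
    if all(isinstance(v, dict) and "class_type" in v for v in graph.values()):
        return "API JSON: POST it to http://127.0.0.1:8188/prompt"
    return "neither shape I recognize"

print(detect_comfy_format("workflow.json"))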

The cookbook

Each entry below is one workflow you could ship today. They run in the order I’d teach them — the first three lean on built-in nodes (CheckpointLoaderSimple, KSampler, VAEDecode, SaveImage), the middle four use ControlNet, refiner chains, and LoRA stacks, and the last three reach for custom-node packs (AnimateDiff-Evolved, IPAdapter_plus, UltimateSDUpscale) the community has made canonical. Every entry pairs with one or two skills or MCP servers from mcp.directory.

Install + README

If the skill isn’t on your machine yet, here’s the one-liner. The full install panel (Codex, Copilot, Antigravity variants) is on the skill page. You also need a running ComfyUI server on http://127.0.0.1:8188 (install instructions are on the upstream repo), and ComfyUI-Manager handles every custom-node pack the cookbook references.

One-line install · by Snoopiam

Open skill page

Install

mkdir -p .claude/skills/comfyui-workflow-builder && curl -L -o skill.zip "https://mcp.directory/api/skills/download/416" && unzip -o skill.zip -d .claude/skills/comfyui-workflow-builder && rm skill.zip

Installs to .claude/skills/comfyui-workflow-builder

Watch it being built

A clean walkthrough of the default ComfyUI graph — what each of the seven hello-world nodes does and why the node topology beats Automatic1111’s tabs once you’ve seen the data flow. Useful before the cookbook because it anchors the visual layout of what Claude is emitting as JSON.

01

Basic txt2img: the seven-node hello world

The minimum viable ComfyUI workflow — checkpoint loader, positive and negative CLIPTextEncode, EmptyLatentImage, KSampler, VAEDecode, SaveImage. The graph every other workflow extends.

For: First-time ComfyUI users who installed last night and want a workflow.json that runs.

The prompt

Generate a ComfyUI API-format workflow.json for a basic SD 1.5 txt2img run. Use CheckpointLoaderSimple with v1-5-pruned-emaonly.safetensors, positive prompt 'golden retriever puppy in a meadow at sunrise, photorealistic, 8k', negative 'blurry, lowres, watermark, text', 512x512, KSampler with euler/normal/20 steps/cfg 7.5/seed 42, VAEDecode, SaveImage with prefix 'puppy'. Drop into ~/ComfyUI/user/default/workflows/ and queue via the API.

What the JSON looks like

{
  "3": { "class_type": "KSampler", "inputs": { "seed": 42, "steps": 20, "cfg": 7.5, "sampler_name": "euler", "scheduler": "normal", "denoise": 1, "model": ["4", 0], "positive": ["6", 0], "negative": ["7", 0], "latent_image": ["5", 0] } },
  "4": { "class_type": "CheckpointLoaderSimple", "inputs": { "ckpt_name": "v1-5-pruned-emaonly.safetensors" } },
  "5": { "class_type": "EmptyLatentImage", "inputs": { "width": 512, "height": 512, "batch_size": 1 } },
  "6": { "class_type": "CLIPTextEncode", "inputs": { "clip": ["4", 1], "text": "golden retriever puppy in a meadow at sunrise, photorealistic, 8k" } },
  "7": { "class_type": "CLIPTextEncode", "inputs": { "clip": ["4", 1], "text": "blurry, lowres, watermark, text" } },
  "8": { "class_type": "VAEDecode", "inputs": { "samples": ["3", 0], "vae": ["4", 2] } },
  "9": { "class_type": "SaveImage", "inputs": { "filename_prefix": "puppy", "images": ["8", 0] } }
}
# POST to http://127.0.0.1:8188/prompt with body {"prompt": <above>}
# → { "prompt_id": "...", "number": 1, "node_errors": {} }

One-line tweak

Bump steps to 30 and switch sampler_name to dpmpp_2m for cleaner small-detail rendering — same graph, two small edits.

02

img2img with denoise control

Same graph as #1 but swap EmptyLatentImage for a LoadImage + VAEEncode chain so KSampler starts from an existing image — denoise=0.55 keeps composition, 0.85 reinvents it.

For: Anyone iterating on a rough sketch, a screenshot, or a previous render they want to nudge.

The prompt

Modify the basic txt2img workflow into img2img. Replace EmptyLatentImage (node 5) with LoadImage reading 'sketch.png' from the input/ folder, plus a VAEEncode node feeding KSampler.latent_image. Set KSampler.denoise to 0.6 (keep composition, repaint surface). Keep the rest. Output as workflow_img2img.json.

What workflow_img2img.json looks like

{
  "3": { "class_type": "KSampler", "inputs": { "seed": 42, "steps": 20, "cfg": 7.5, "sampler_name": "euler", "scheduler": "normal", "denoise": 0.6, "model": ["4", 0], "positive": ["6", 0], "negative": ["7", 0], "latent_image": ["11", 0] } },
  "10": { "class_type": "LoadImage", "inputs": { "image": "sketch.png" } },
  "11": { "class_type": "VAEEncode", "inputs": { "pixels": ["10", 0], "vae": ["4", 2] } },
  "4":  { "class_type": "CheckpointLoaderSimple", "inputs": { "ckpt_name": "v1-5-pruned-emaonly.safetensors" } },
  "6":  { "class_type": "CLIPTextEncode", "inputs": { "clip": ["4", 1], "text": "oil painting of a stone bridge, 19th century" } }
  /* + nodes 7, 8, 9 unchanged from #1 */
}

One-line tweak

If denoise=0.6 still erases your sketch's lines, drop to 0.4 and bump steps to 30 — fewer surface changes, more refinement.

03

ControlNet pose with a depth or canny preprocessor

Lock the pose of an input photo into the new render. Adds ControlNetLoader + a preprocessor (DepthAnything or Canny) and a ControlNetApply node between the conditioning and KSampler.

For: Designers blocking poses for character art, or anyone replacing a stock photo while keeping the silhouette.

The prompt

Build an SD 1.5 ControlNet workflow with depth preprocessing. Use LoadImage('reference.jpg'), DepthAnythingPreprocessor (or MiDaS) → ControlNetApplyAdvanced(strength=0.85) on the positive conditioning. ControlNetLoader = control_v11f1p_sd15_depth.pth. Positive prompt 'a knight in glowing armor, cinematic, volumetric light'. Save as workflow_controlnet_depth.json.

What workflow_controlnet_depth.json looks like

{
  "12": { "class_type": "ControlNetLoader", "inputs": { "control_net_name": "control_v11f1p_sd15_depth.pth" } },
  "13": { "class_type": "LoadImage", "inputs": { "image": "reference.jpg" } },
  "14": { "class_type": "DepthAnythingPreprocessor", "inputs": { "image": ["13", 0], "resolution": 512 } },
  "15": { "class_type": "ControlNetApplyAdvanced", "inputs": { "positive": ["6", 0], "negative": ["7", 0], "control_net": ["12", 0], "image": ["14", 0], "strength": 0.85, "start_percent": 0, "end_percent": 1 } },
  "3":  { "class_type": "KSampler", "inputs": { "positive": ["15", 0], "negative": ["15", 1], "latent_image": ["5", 0], "model": ["4", 0], "seed": 42, "steps": 20, "cfg": 7.5, "sampler_name": "euler", "scheduler": "normal", "denoise": 1 } }
}
# Requires: comfyui_controlnet_aux custom node pack (DepthAnythingPreprocessor)

One-line tweak

Swap DepthAnything for CannyEdgePreprocessor when the input is a line drawing — depth bakes in volume, canny respects strokes.

04

SDXL base + refiner two-pass workflow

Run the SDXL base model for the first 80% of denoising, then hand the latent to the SDXL refiner for the last 20% — the official two-stage pattern from the SDXL paper.

For: SDXL users who want sharper hands, faces, and small text without leaving ComfyUI.

The prompt

Generate a two-pass SDXL workflow. CheckpointLoaderSimple(sd_xl_base_1.0.safetensors) → CLIPTextEncodeSDXL(positive,negative) → KSamplerAdvanced(steps=25, end_at_step=20, return_with_leftover_noise=enable). Pipe leftover latent into a second KSamplerAdvanced(start_at_step=20) using sd_xl_refiner_1.0.safetensors. 1024x1024, dpmpp_2m_sde / karras. Save as workflow_sdxl_two_pass.json.

What workflow_sdxl_two_pass.json looks like

{
  "20": { "class_type": "CheckpointLoaderSimple", "inputs": { "ckpt_name": "sd_xl_base_1.0.safetensors" } },
  "21": { "class_type": "CheckpointLoaderSimple", "inputs": { "ckpt_name": "sd_xl_refiner_1.0.safetensors" } },
  "22": { "class_type": "EmptyLatentImage", "inputs": { "width": 1024, "height": 1024, "batch_size": 1 } },
  "23": { "class_type": "KSamplerAdvanced", "inputs": { "add_noise": "enable", "noise_seed": 0, "steps": 25, "cfg": 7.5, "sampler_name": "dpmpp_2m_sde", "scheduler": "karras", "start_at_step": 0, "end_at_step": 20, "return_with_leftover_noise": "enable", "model": ["20", 0], "positive": ["6", 0], "negative": ["7", 0], "latent_image": ["22", 0] } },
  "24": { "class_type": "KSamplerAdvanced", "inputs": { "add_noise": "disable", "noise_seed": 0, "steps": 25, "cfg": 7.5, "sampler_name": "dpmpp_2m_sde", "scheduler": "karras", "start_at_step": 20, "end_at_step": 25, "return_with_leftover_noise": "disable", "model": ["21", 0], "positive": ["6", 0], "negative": ["7", 0], "latent_image": ["23", 0] } }
}

One-line tweak

If hands still drift, drop refiner start_at_step from 20 to 18 — give the refiner a tiny bit more denoising headroom.

05

LoRA stack with three weights tuned

Chain three LoraLoader nodes between the checkpoint and KSampler — a character LoRA at 0.8, a style LoRA at 0.5, a detail LoRA at 0.3 — so each contributes without one swallowing the others.

For: Anyone with a folder of safetensors LoRAs they want to mix without retraining.

The prompt

Build a LoRA stack workflow. After CheckpointLoaderSimple, chain LoraLoader('character_v3.safetensors', strength_model=0.8, strength_clip=0.8) → LoraLoader('cinematic_style.safetensors', 0.5, 0.5) → LoraLoader('skin_detail.safetensors', 0.3, 0.3) → CLIPTextEncode(positive). Use SDXL base, 1024x1024, dpmpp_2m_sde, 30 steps. Output as workflow_lora_stack.json.

What workflow_lora_stack.json looks like

{
  "4":  { "class_type": "CheckpointLoaderSimple", "inputs": { "ckpt_name": "sd_xl_base_1.0.safetensors" } },
  "30": { "class_type": "LoraLoader", "inputs": { "lora_name": "character_v3.safetensors", "strength_model": 0.8, "strength_clip": 0.8, "model": ["4", 0], "clip": ["4", 1] } },
  "31": { "class_type": "LoraLoader", "inputs": { "lora_name": "cinematic_style.safetensors", "strength_model": 0.5, "strength_clip": 0.5, "model": ["30", 0], "clip": ["30", 1] } },
  "32": { "class_type": "LoraLoader", "inputs": { "lora_name": "skin_detail.safetensors", "strength_model": 0.3, "strength_clip": 0.3, "model": ["31", 0], "clip": ["31", 1] } },
  "6":  { "class_type": "CLIPTextEncode", "inputs": { "clip": ["32", 1], "text": "<lora:character_v3:0.8> <lora:cinematic_style:0.5> portrait of the character" } }
}
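# Note: the LoraLoader chain above carries the LoRA weights — core ComfyUI's CLIPTextEncode
# does not parse A1111-style <lora:...> tags, so the prompt text stays plain.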

One-line tweak

Sum of strength_model values >1.5 cooks the model — if outputs go melty, scale every weight by 0.7 and re-check.
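If you'd rather rescale the whole stack in one shot than edit three nodes by hand, a short pass over the API JSON does it. A sketch, assuming your stack lives in workflow_lora_stack.json; the 0.7 factor is the one from the tweak above:

import json

def scale_lora_stack(path: str, factor: float = 0.7) -> None:
    """Multiply every LoraLoader strength in an API-format workflow by one factor."""
    wf = json.load(open(path))
    for node in wf.values():
        if node.get("class_type") == "LoraLoader":
            node["inputs"]["strength_model"] = round(node["inputs"]["strength_model"] * factor, 3)
            node["inputs"]["strength_clip"] = round(node["inputs"]["strength_clip"] * factor, 3)
    json.dump(wf, open(path, "w"), indent=2)

scale_lora_stack("workflow_lora_stack.json")  # 0.8 / 0.5 / 0.3 becomes 0.56 / 0.35 / 0.21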

06

Inpainting with a mask brush

Repaint just one region of an image — the SDXL Inpainting checkpoint plus a LoadImage that returns both pixels and a mask. Used for replacing backgrounds, fixing faces, removing objects.

For: Photo retouchers and product-shot editors who want to change one thing without re-rendering the whole frame.

The prompt

Build an SDXL inpainting workflow. CheckpointLoaderSimple('sd_xl_inpainting_0.1.safetensors'). LoadImage('source.png') returns image+mask (the alpha channel becomes the mask). VAEEncodeForInpaint with grow_mask_by=6. KSampler at denoise=1 inside the masked region only. Positive prompt: 'a small wooden table'. Save as workflow_inpaint.json.

What workflow_inpaint.json looks like

{
  "40": { "class_type": "LoadImage", "inputs": { "image": "source.png" } },
  "41": { "class_type": "CheckpointLoaderSimple", "inputs": { "ckpt_name": "sd_xl_inpainting_0.1.safetensors" } },
  "42": { "class_type": "VAEEncodeForInpaint", "inputs": { "pixels": ["40", 0], "vae": ["41", 2], "mask": ["40", 1], "grow_mask_by": 6 } },
  "3":  { "class_type": "KSampler", "inputs": { "seed": 42, "steps": 25, "cfg": 7.5, "sampler_name": "dpmpp_2m", "scheduler": "karras", "denoise": 1, "model": ["41", 0], "positive": ["6", 0], "negative": ["7", 0], "latent_image": ["42", 0] } }
}
# Mask comes from the alpha channel of source.png (paint it transparent in any editor)

One-line tweak

If the inpainted region has visible seams, raise grow_mask_by from 6 to 12 — bigger feather, smoother blend.

07

Upscale via 4x-UltraSharp + tile pass

Two-stage upscale: ESRGAN-family UpscaleModelLoader (4x_NMKD-Siax_200k or 4x-UltraSharp) for the resolution bump, then a tiled KSampler diffusion pass to add detail without seams.

For: Anyone whose 1024x1024 output needs to land on a 4K poster or a retina hero image.

The prompt

Build a two-stage upscale workflow. UpscaleModelLoader('4x-UltraSharp.pth') → ImageUpscaleWithModel for 4x. Then UltimateSDUpscale (custom node) at tile_size=1024, denoise=0.25, steps=15, dpmpp_2m_sde, with the same SDXL checkpoint and the original prompt for tile-pass conditioning. Save as workflow_upscale_4k.json.

What workflow_upscale_4k.json looks like

{
  "50": { "class_type": "UpscaleModelLoader", "inputs": { "model_name": "4x-UltraSharp.pth" } },
  "51": { "class_type": "LoadImage", "inputs": { "image": "render_1024.png" } },
  "52": { "class_type": "ImageUpscaleWithModel", "inputs": { "upscale_model": ["50", 0], "image": ["51", 0] } },
  "53": { "class_type": "UltimateSDUpscale", "inputs": { "image": ["52", 0], "model": ["4", 0], "positive": ["6", 0], "negative": ["7", 0], "vae": ["4", 2], "upscale_by": 1, "seed": 42, "steps": 15, "cfg": 7, "sampler_name": "dpmpp_2m_sde", "scheduler": "karras", "denoise": 0.25, "mode_type": "Linear", "tile_width": 1024, "tile_height": 1024, "mask_blur": 8, "tile_padding": 32 } }
}
# Requires: ComfyUI_UltimateSDUpscale custom node

One-line tweak

Drop denoise from 0.25 to 0.18 if tile seams reappear — less per-tile reinvention, more pure upscale.

08

AnimateDiff motion module — 16-frame loop

Wrap a base SD 1.5 graph with the AnimateDiff motion module to produce a 16-frame, 8-fps looped animation. Adds AnimateDiffLoaderGen1 + a VHS_VideoCombine output node.

For: Anyone shipping animated banners, looping social posts, or sketch animatics.

The prompt

Build an AnimateDiff txt2video workflow. CheckpointLoaderSimple('photonV1.safetensors'). AnimateDiffLoaderGen1(model_name='v3_sd15_mm.ckpt', beta_schedule='sqrt_linear (AnimateDiff)'). EmptyLatentImage with batch_size=16. KSampler steps=20, dpmpp_2m, denoise=1. VHS_VideoCombine at 8fps, output mp4, prefix 'loop'. Save as workflow_animatediff.json.

What workflow_animatediff.json looks like

{
  "60": { "class_type": "AnimateDiffLoaderGen1", "inputs": { "model_name": "v3_sd15_mm.ckpt", "beta_schedule": "sqrt_linear (AnimateDiff)", "model": ["4", 0] } },
  "61": { "class_type": "EmptyLatentImage", "inputs": { "width": 512, "height": 768, "batch_size": 16 } },
  "62": { "class_type": "KSampler", "inputs": { "seed": 42, "steps": 20, "cfg": 7.5, "sampler_name": "dpmpp_2m", "scheduler": "karras", "denoise": 1, "model": ["60", 0], "positive": ["6", 0], "negative": ["7", 0], "latent_image": ["61", 0] } },
  "63": { "class_type": "VHS_VideoCombine", "inputs": { "images": ["8", 0], "frame_rate": 8, "loop_count": 0, "filename_prefix": "loop", "format": "video/h264-mp4", "pingpong": false } }
}
# Requires: ComfyUI-AnimateDiff-Evolved + ComfyUI-VideoHelperSuite
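# Node 8 = a VAEDecode (as in use case 01) with samples rewired to ["62", 0], decoding all 16 frames for VHS_VideoCombine.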

One-line tweak

Set pingpong=true for a back-and-forth loop with no jump cut — perfect for product hero loops.

09

IPAdapter face transfer + style transfer combo

Two IPAdapter nodes — one carrying a face reference, one carrying a style reference — both feeding the same KSampler conditioning. Same person, new aesthetic, in one pass.

For: Brand designers, character illustrators, and anyone shipping a consistent persona across many backgrounds.

The prompt

Build an IPAdapter face+style workflow. IPAdapterUnifiedLoader('PLUS (high strength)') with the SDXL base. Two parallel IPAdapter nodes: face image='face_ref.png' weight=0.85 weight_type='linear'; style image='style_ref.png' weight=0.6 weight_type='style transfer'. Both pipe into the model side of KSampler. Save as workflow_ipadapter_combo.json.

What workflow_ipadapter_combo.json looks like

{
  "70": { "class_type": "IPAdapterUnifiedLoader", "inputs": { "model": ["4", 0], "preset": "PLUS (high strength)" } },
  "71": { "class_type": "LoadImage", "inputs": { "image": "face_ref.png" } },
  "72": { "class_type": "LoadImage", "inputs": { "image": "style_ref.png" } },
  "73": { "class_type": "IPAdapter", "inputs": { "model": ["70", 0], "ipadapter": ["70", 1], "image": ["71", 0], "weight": 0.85, "weight_type": "linear", "start_at": 0, "end_at": 1 } },
  "74": { "class_type": "IPAdapter", "inputs": { "model": ["73", 0], "ipadapter": ["70", 1], "image": ["72", 0], "weight": 0.6, "weight_type": "style transfer", "start_at": 0, "end_at": 1 } },
  "3":  { "class_type": "KSampler", "inputs": { "model": ["74", 0], "positive": ["6", 0], "negative": ["7", 0], "latent_image": ["5", 0], "seed": 42, "steps": 25, "cfg": 6.5, "sampler_name": "dpmpp_2m", "scheduler": "karras", "denoise": 1 } }
}
# Requires: ComfyUI_IPAdapter_plus by cubiq

One-line tweak

If face fidelity slips, raise the face IPAdapter weight to 1.0 and drop style to 0.4 — face is the harder lock.

10

Batch render N variants from a prompt list

Loop one workflow over a list of prompts (or seeds, or LoRA weights) so a single queue produces a labeled grid. Uses the API queue endpoint and a small wrapper script.

For: Anyone running an A/B-style render comparison — palette tests, seed sweeps, prompt ablations.

The prompt

Wrap the basic txt2img workflow in a Python loop that POSTs to /prompt for each entry in PROMPTS = ['a fox in a forest', 'a fox in a city', 'a fox in a desert', 'a fox in snow']. For each, change CLIPTextEncode.text and SaveImage.filename_prefix, post to http://127.0.0.1:8188/prompt, and write the prompt_id to a CSV. Save as run_batch.py.

What run_batch.py looks like

import json, urllib.request, csv
WORKFLOW = json.load(open('workflow.json'))  # API format
PROMPTS = ['a fox in a forest', 'a fox in a city', 'a fox in a desert', 'a fox in snow']
rows = []
for i, prompt in enumerate(PROMPTS):
    WORKFLOW['6']['inputs']['text'] = prompt
    WORKFLOW['9']['inputs']['filename_prefix'] = f'batch_{i:02d}'
    body = json.dumps({'prompt': WORKFLOW}).encode()
    req  = urllib.request.Request('http://127.0.0.1:8188/prompt', data=body, headers={'Content-Type': 'application/json'})
    pid  = json.loads(urllib.request.urlopen(req).read())['prompt_id']
    rows.append([i, prompt, pid])
csv.writer(open('batch.csv', 'w')).writerows(rows)
# → 4 jobs queued, prompt_ids logged

One-line tweak

Vary `KSampler.seed` instead of `text` to get a four-up seed grid for the same prompt — useful for picking a hero render.
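run_batch.py only queues jobs; it doesn't wait for renders. If you also want the finished PNGs pulled back automatically, ComfyUI's /history and /view endpoints handle it. A follow-up sketch, assuming the batch.csv layout written above (index, prompt, prompt_id):

import csv, json, time, urllib.parse, urllib.request

BASE = "http://127.0.0.1:8188"

def download_outputs(prompt_id: str) -> None:
    """Poll /history until the job is done, then fetch each output image via /view."""
    while True:
        history = json.loads(urllib.request.urlopen(f"{BASE}/history/{prompt_id}").read())
        if prompt_id in history:          # the entry appears once execution has finished
            break
        time.sleep(2)
    for node_output in history[prompt_id]["outputs"].values():
        for img in node_output.get("images", []):
            query = urllib.parse.urlencode(
                {"filename": img["filename"], "subfolder": img["subfolder"], "type": img["type"]})
            with open(img["filename"], "wb") as f:
                f.write(urllib.request.urlopen(f"{BASE}/view?{query}").read())

for _index, _prompt, prompt_id in csv.reader(open("batch.csv")):
    download_outputs(prompt_id)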

Community signal

Three voices from people running ComfyUI as their daily pipeline. The first explains why the embedded-PNG-workflow feature has no real equivalent in Automatic1111; the second is the simplest one-line endorsement on launch threads; the third is the everyday ComfyUI-Manager story that resolves the “workflow loaded but seven nodes are red” problem.

A1111 by nature, has a bunch of disconnected operations in separate tabs and scripts. … Even if the PNG captures all of a generation operation that would be executed by a single launch-button click, its not really equivalent to capturing a whole ComfyUI workflow, which can be the equivalent of a process which would be numerous different tasks in A1111.

dragonwriter · Hacker News

Comment on the SDXL Turbo + ComfyUI HN thread, explaining why the embedded-PNG-workflow feature has no real equivalent in Automatic1111 — A1111's tabs break the pipeline into manual hand-offs.

Source

I love that they embed entire workflows into the meta of their images.

tetris11 · Hacker News

Top-voted reaction on the SDXL Turbo HN thread. The drag-PNG-to-load behavior — every output PNG is a copy of the workflow that produced it — is the single feature most often called out as ComfyUI's killer move.

Source

Combined with the ComfyUI Manager extensions which provides an index of custom node packages and can install missing ones from a loaded workflow it makes it very easy to get up and running with a new workflow.

dragonwriter · Hacker News

Companion comment from the same thread. ComfyUI-Manager turns the otherwise-painful 'workflow loaded but seven nodes are red' problem into a one-click install of the missing custom-node packs.

Source

The contrarian take

Not everyone is sold on the node graph as the right surface for day-to-day work. The most precise critique I’ve seen on HN is from LucasPi:

Parsing ComfyUI workflows is tricky because of the spaghetti node graph.

LucasPi · Hacker News

From a recent HN comment about parsing ComfyUI workflows.

Source

Honest critique. The spaghetti is the cost of generality — every node is editable, but every node also has to be wired. The cookbook leans on the API JSON format (the structured class_type/inputs shape Claude emits) rather than the visual graph for exactly this reason: structured JSON is parseable, the on-canvas spaghetti is not. For users who genuinely prefer a tabbed UI, two alternatives are still healthy: the original Automatic1111 (now in maintenance, but battle-tested) and Forge (an A1111 fork with better VRAM management). lbeltrame on HN summarized the trade neatly: ComfyUI is daunting on day one, but it’s the best tool for actually understanding what the diffusion pipeline is doing between checkpoint and VAE. That’s also why this cookbook’s use cases get progressively more node-heavy — once the graph stops scaring you, the two-IPAdapter face+style combo in use case 09 stops looking crazy.

One more alternative worth naming: a few community ComfyUI MCP servers wrap the running server’s /prompt and /history endpoints as MCP tools. The trade-off is the usual skill-vs-MCP one. The skill is ~110 idle tokens and emits API JSON you POST yourself; an MCP’s tool schemas load every turn but let multiple AI clients (Claude Code, Cursor, an internal agent) share the same GPU. Pick the MCP only when that’s actually true for your team — otherwise stay with the comfyui-workflow-builder skill in this cookbook.

Real workflows shipped with ComfyUI

Concrete examples from the upstream and community ecosystem. None of these used the Claude skill specifically — they’re here so you have a target shape in mind when you ask Claude for a workflow.

Gotchas (the four that bite)

Sourced from the SDXL Turbo + ComfyUI HN thread and the ComfyUI README.

API JSON and workflow JSON are not interchangeable

Save (API Format) emits the flat class_type/inputs shape the /prompt endpoint accepts. The default Save emits the visual graph format the GUI loads. The skill emits API JSON. If you double-click an API JSON in the GUI it will fail — drag the embedded PNG instead.

Custom nodes break the moment you load someone else's workflow

ControlNet preprocessors, IPAdapter, AnimateDiff-Evolved, UltimateSDUpscale all live outside core ComfyUI. The fix is dragonwriter's HN observation: ComfyUI-Manager's 'Install Missing Custom Nodes' button reads the loaded workflow's node names and offers a one-click install. Install Manager first, before any cookbook entry past use case 2.
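You can also catch missing packs before the red-node moment: the running server lists every node class it knows at GET /object_info, so a short diff against a workflow's class_types tells you what Manager still needs to install. A sketch; the workflow filename is whichever cookbook entry you're loading:

import json, urllib.request

server_nodes = set(json.loads(
    urllib.request.urlopen("http://127.0.0.1:8188/object_info").read()).keys())
workflow = json.load(open("workflow_ipadapter_combo.json"))

missing = {node["class_type"] for node in workflow.values()} - server_nodes
print("custom nodes still missing:", sorted(missing) or "none")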

VRAM ceilings hit hard at SDXL + LoRA + ControlNet

An SDXL base + refiner + 3 LoRAs + a ControlNet preprocessor runs about 16 GB on a single 1024x1024 batch. If you only have 8–12 GB, drop the refiner pass first, then the third LoRA, then halve the latent. The tile-pass upscale in use case 7 is the cleanest way to recover 4K from a 1024 base.
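Dropping the refiner is also scriptable against the use case 04 JSON: run the base sampler for all 25 steps and cut node 24 out of the graph. A sketch assuming the node IDs from that snippet (23 = base pass, 24 = refiner pass); adjust if yours differ:

import json

wf = json.load(open("workflow_sdxl_two_pass.json"))
wf["23"]["inputs"].update({"end_at_step": 25, "return_with_leftover_noise": "disable"})
del wf["24"]                                       # remove the refiner pass
for node in wf.values():                           # repoint anything that read the refiner's latent
    for key, ref in node["inputs"].items():
        if isinstance(ref, list) and ref and ref[0] == "24":
            node["inputs"][key] = ["23", ref[1]]
json.dump(wf, open("workflow_sdxl_low_vram.json", "w"), indent=2)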

Seeds are not reproducible across PyTorch versions or GPUs

A seed=42 render on a 4090 and the same seed on a 3070 are not pixel-identical. Same for PyTorch 2.1 vs 2.3. If you need exact reproduction, lock both. If you just need 'close enough,' the prompt + seed + sampler triple is what to record (and ComfyUI bakes all three into the PNG metadata for free).
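Reading that triple back out of a render takes a few lines with Pillow, since ComfyUI stores the API graph in the PNG's "prompt" text chunk (and the GUI graph in "workflow"). A sketch; the file name is a placeholder for any SaveImage output:

import json
from PIL import Image

meta = Image.open("puppy_00001_.png").text      # PNG text chunks written by SaveImage
graph = json.loads(meta["prompt"])              # the API-format workflow, keyed by node ID
for node in graph.values():
    if node["class_type"] == "KSampler":
        inputs = node["inputs"]
        print(inputs["seed"], inputs["sampler_name"], inputs["steps"], inputs["cfg"])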

Pairs well with

Curated to match the cookbook’s actual integrations: the Stable-Diffusion-adjacent skills the cookbook reaches for (stable-diffusion-image-generation, lora-manager-e2e, comfy-cli, comfyui-request, image-upscaling, image-enhancer, nano-banana-pro) plus the Hugging Face / Replicate / Google AI Studio MCP servers that handle model hosting and hosted-GPU fallback.

Two posts that compose well with this cookbook: What are Claude Code skills? covers the underlying mechanism, and the Nano Banana Pro skill guide is the hosted-API counterpart for use cases where you want a single editorial render rather than a local pipeline.

Frequently asked questions

Is there a ComfyUI MCP server I can use instead of the comfyui-workflow-builder skill?

Several community ComfyUI MCP servers exist (search the catalog for 'comfyui') — they wrap the running server's /prompt and /history endpoints as MCP tools. The trade-off is the usual one. The skill costs about 110 idle tokens and emits API JSON that you POST yourself; an MCP loads its tool schemas every turn but lets multiple AI clients share the same ComfyUI instance. Reach for the MCP only when Claude Code, Cursor, and an internal agent all need to enqueue against the same GPU. Otherwise, the comfyui-workflow-builder skill is the cheaper composition for solo workflow design.

What is the difference between the ComfyUI workflow JSON and the API JSON format?

Two formats, one engine. The 'workflow JSON' is the visual graph format the GUI saves — top-level keys nodes, links, groups, with x/y positions and widget values for the canvas. The 'API JSON' is the executable format the /prompt endpoint accepts — a flat object keyed by numeric node ID, where each node has class_type and inputs (with cross-node refs as ["id", out_index] tuples). The skill emits API JSON because that's what scripts queue, what is round-trippable, and what is parseable. To go GUI-side, use Save (API Format) in the menu — the file you get matches what the skill emits.

Why is 'comfyui workflow builder' a high-CTR query and what does the skill actually build?

The query 'comfyui workflow builder' shows a 16.67% CTR in our GSC data because users are looking for exactly what this skill is — a way to author the workflow JSON without dragging nodes by hand. Claude reads the skill, asks one or two clarifying questions (which model, which features), and emits a complete API JSON workflow you can POST to the running server or drop into ~/ComfyUI/user/default/workflows/. It will reference the canonical built-in nodes (CheckpointLoaderSimple, CLIPTextEncode, KSampler, VAEDecode, SaveImage, ControlNetLoader, LoraLoader, IPAdapter, AnimateDiffLoaderGen1) and call out any custom-node pack you'll need to install via ComfyUI-Manager.

Does the skill install ComfyUI for me, or download the checkpoints?

No. ComfyUI itself is a `git clone https://github.com/comfyanonymous/ComfyUI` plus a `pip install -r requirements.txt`, then you run `python main.py`. Checkpoints (sd_xl_base_1.0.safetensors, v1-5-pruned-emaonly.safetensors, etc.) live under models/checkpoints/, LoRAs under models/loras/, ControlNets under models/controlnet/. The skill assumes those are in place. For custom nodes (DepthAnythingPreprocessor, IPAdapter, AnimateDiff-Evolved, UltimateSDUpscale), the skill will name the pack and you install via ComfyUI-Manager — the canonical workflow is open Manager, click 'Install Missing Custom Nodes', restart.

Can the skill work with Flux, SD3, and the newer model families?

Yes. ComfyUI is one of the first front-ends to land Flux-Dev, Flux-Schnell, and SD3 support, and the skill knows the corresponding node names — UNETLoader for Flux's separate UNet, DualCLIPLoader for the t5xxl + clip_l text encoder pair, ModelSamplingFlux for sigma scheduling. Tell the skill the family and the variant ('build a Flux-Dev txt2img with t5xxl_fp8 and a 1024x1024 latent') and it will emit the right loader chain. The cookbook above sticks to SD 1.5 and SDXL because that's where the install ecosystem and LoRA library are most mature, but the same shape applies.

Why is the bare-term query 'comfyui' getting impressions on this page?

Google sometimes serves long-tail pages on bare-brand queries. If you typed just 'comfyui' looking for the official tool, the home is github.com/comfyanonymous/ComfyUI and the docs are docs.comfy.org. This page is specifically about the Claude skill that authors workflow JSON — useful once you have ComfyUI running. The 'comfyui skill' (156 imp), 'comfyui workflow builder' (6 imp at 16.67% CTR), and 'comfyui mcp' clusters all route here intentionally.

How does Claude know which custom-node packs my workflow needs?

The skill tags every non-built-in node with the pack name in a comment block at the top of the JSON or in the chat reply ('Requires: ComfyUI_IPAdapter_plus by cubiq', 'Requires: ComfyUI-AnimateDiff-Evolved'). When you load the JSON, ComfyUI-Manager surfaces missing nodes via 'Install Missing Custom Nodes' and offers a one-click install — the canonical answer to dragonwriter's HN observation that Manager 'can install missing ones from a loaded workflow'.

Sources

Primary

Community

Critical and contrarian

Internal
