doubleword-batches

Create and manage batch inference jobs using the Doubleword API (api.doubleword.ai). Use when users want to: (1) Process multiple AI requests in batch mode, (2) Submit JSONL batch files for async inference, (3) Monitor batch job progress and retrieve results, (4) Work with OpenAI-compatible batch endpoints, (5) Handle large-scale inference workloads that don't require immediate responses, (6) Use tool calling or structured outputs in batches, (7) Automatically batch API calls with autobatcher.

Install

mkdir -p .claude/skills/doubleword-batches && curl -L -o skill.zip "https://mcp.directory/api/skills/download/7989" && unzip -o skill.zip -d .claude/skills/doubleword-batches && rm skill.zip

Installs to .claude/skills/doubleword-batches

About this skill

Doubleword Batch Inference

Process multiple AI inference requests asynchronously using the Doubleword batch API with high throughput and low cost.

Prerequisites

Before submitting batches, you need:

  1. Doubleword Account - Sign up at https://app.doubleword.ai/
  2. API Key - Create one in the API Keys section of your dashboard
  3. Account Credits - Add credits to process requests (see pricing below)

When to Use Batches

Batches are ideal for:

  • Multiple independent requests that can run simultaneously
  • Workloads that don't require immediate responses
  • Large volumes that would exceed rate limits if sent individually
  • Cost-sensitive workloads (the 24h window is 56-83% cheaper than realtime, per the pricing below)
  • Tool calling and structured output generation at scale

Available Models & Pricing

Pricing is per 1 million tokens (input / output):

Qwen3-VL-30B-A3B-Instruct-FP8 (mid-size):

  • Realtime SLA: $0.16 / $0.80
  • 1-hour SLA: $0.07 / $0.30 (56% cheaper)
  • 24-hour SLA: $0.05 / $0.20 (69% cheaper)

Qwen3-VL-235B-A22B-Instruct-FP8 (flagship):

  • Realtime SLA: $0.60 / $1.20
  • 1-hour SLA: $0.15 / $0.55 (75% cheaper)
  • 24-hour SLA: $0.10 / $0.40 (83% cheaper)
  • Supports up to 262K total tokens, 16K new tokens per request
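
Before using the Console preview, the per-token arithmetic can be sketched directly. Prices below are copied from the table above; the token counts in the example are illustrative, and the exact model-id strings are an assumption for lookup purposes:

```python
# Prices in $ per 1M tokens (input, output), copied from the table above.
PRICES = {
    ("Qwen3-VL-30B-A3B-Instruct-FP8", "1h"): (0.07, 0.30),
    ("Qwen3-VL-30B-A3B-Instruct-FP8", "24h"): (0.05, 0.20),
    ("Qwen3-VL-235B-A22B-Instruct-FP8", "1h"): (0.15, 0.55),
    ("Qwen3-VL-235B-A22B-Instruct-FP8", "24h"): (0.10, 0.40),
}

def batch_cost(model, window, input_tokens, output_tokens):
    # Dollar cost of one batch at the given SLA window.
    price_in, price_out = PRICES[(model, window)]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# Example: 1M input + 200K output tokens on the flagship at the 24h SLA
# 1.0 * $0.10 + 0.2 * $0.40 = $0.18
cost = batch_cost("Qwen3-VL-235B-A22B-Instruct-FP8", "24h", 1_000_000, 200_000)
```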

Cost estimation: Upload files to the Doubleword Console to preview expenses before submitting.

Quick Start

Two ways to submit batches:

Via API:

  1. Create JSONL file with requests
  2. Upload file to get file ID
  3. Create batch using file ID
  4. Poll status until complete
  5. Download results from output_file_id

Via Web Console:

  1. Navigate to Batches section at https://app.doubleword.ai/
  2. Upload JSONL file
  3. Configure batch settings (model, completion window)
  4. Monitor progress in real-time dashboard
  5. Download results when ready

Workflow

Step 1: Create Batch Request File

Create a .jsonl file where each line contains a complete, valid JSON object with no line breaks within the object:

{"custom_id": "req-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "anthropic/claude-3-5-sonnet", "messages": [{"role": "user", "content": "What is 2+2?"}]}}
{"custom_id": "req-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "anthropic/claude-3-5-sonnet", "messages": [{"role": "user", "content": "What is the capital of France?"}]}}

Required fields per line:

  • custom_id: Unique identifier (max 64 chars) - use descriptive IDs like "user-123-question-5" for easier result mapping
  • method: Always "POST"
  • url: API endpoint - "/v1/chat/completions" or "/v1/embeddings"
  • body: Standard API request with model and messages

Optional body parameters:

  • temperature: 0-2 (default: 1.0)
  • max_tokens: Maximum response tokens
  • top_p: Nucleus sampling parameter
  • stop: Stop sequences
  • tools: Tool definitions for tool calling (see Tool Calling section)
  • response_format: JSON schema for structured outputs (see Structured Outputs section)

File requirements:

  • Max size: 200MB
  • Format: JSONL only (JSON Lines - newline-delimited JSON)
  • Each line must be valid JSON with no internal line breaks
  • No duplicate custom_id values
  • Split large batches into multiple files if needed

Common pitfalls:

  • Line breaks within JSON objects (will cause parsing errors)
  • Invalid JSON syntax
  • Duplicate custom_id values
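
These pitfalls are easy to catch before upload. A minimal pre-flight check, sketched in Python against the required fields and file rules listed above:

```python
import json

def validate_batch_file(path):
    """Check a JSONL batch file for the pitfalls listed above.

    Returns a list of (line_number, problem) tuples; empty means OK.
    """
    problems = []
    seen_ids = set()
    with open(path) as f:
        for n, line in enumerate(f, 1):
            if not line.strip():
                continue
            try:
                req = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((n, f"invalid JSON: {exc}"))
                continue
            # Every line needs all four top-level fields.
            for field in ("custom_id", "method", "url", "body"):
                if field not in req:
                    problems.append((n, f"missing field: {field}"))
            cid = req.get("custom_id")
            if cid in seen_ids:
                problems.append((n, f"duplicate custom_id: {cid}"))
            seen_ids.add(cid)
    return problems
```

Run it before uploading; a non-empty return means fix the file and retry.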

Helper script: Use scripts/create_batch_file.py to generate JSONL files programmatically:

python scripts/create_batch_file.py output.jsonl

Modify the script's requests list to generate your specific batch requests.
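
If the helper script isn't available, an equivalent generator is a few lines of Python. The model id is copied from the example above; json.dumps never emits raw newlines, so each object is guaranteed to stay on one line:

```python
import json

def write_batch_file(path, prompts, model="anthropic/claude-3-5-sonnet"):
    # One chat-completions request per prompt; custom_ids are sequential.
    with open(path, "w") as f:
        for i, prompt in enumerate(prompts, 1):
            request = {
                "custom_id": f"req-{i}",
                "method": "POST",
                "url": "/v1/chat/completions",
                "body": {"model": model,
                         "messages": [{"role": "user", "content": prompt}]},
            }
            f.write(json.dumps(request) + "\n")

write_batch_file("batch_requests.jsonl",
                 ["What is 2+2?", "What is the capital of France?"])
```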

Step 2: Upload File

Via API:

curl https://api.doubleword.ai/v1/files \
  -H "Authorization: Bearer $DOUBLEWORD_API_KEY" \
  -F purpose="batch" \
  -F file="@batch_requests.jsonl"

Via Console: Upload through the Batches section at https://app.doubleword.ai/

Response contains id field - save this file ID for next step.

Step 3: Create Batch

Via API:

curl https://api.doubleword.ai/v1/batches \
  -H "Authorization: Bearer $DOUBLEWORD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input_file_id": "file-abc123",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h"
  }'

Via Console: Configure batch settings in the web interface.

Parameters:

  • input_file_id: File ID from upload step
  • endpoint: API endpoint ("/v1/chat/completions" or "/v1/embeddings")
  • completion_window: Choose based on urgency and budget:
    • "24h": Best pricing, results within 24 hours (typically faster)
    • "1h": 50% price premium, results within 1 hour (typically faster)
    • Realtime: Limited capacity, highest cost (batch service optimized for async)

Response contains batch id - save this for status polling.

Before submitting, verify:

  • You have access to the specified model
  • Your API key is active
  • You have sufficient account credits
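
The curl call above translates directly to stdlib Python. A sketch (the payload mirrors the curl body; the batch id is read from the response):

```python
import json
import urllib.request

API_BASE = "https://api.doubleword.ai/v1"

def batch_payload(input_file_id, endpoint="/v1/chat/completions",
                  completion_window="24h"):
    # Body for POST /v1/batches, matching the curl example above.
    return {"input_file_id": input_file_id,
            "endpoint": endpoint,
            "completion_window": completion_window}

def create_batch(input_file_id, api_key):
    req = urllib.request.Request(
        f"{API_BASE}/batches",
        data=json.dumps(batch_payload(input_file_id)).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["id"]  # batch id, used for status polling
```

Call as create_batch("file-abc123", os.environ["DOUBLEWORD_API_KEY"]).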

Step 4: Poll Status

Via API:

curl https://api.doubleword.ai/v1/batches/batch-xyz789 \
  -H "Authorization: Bearer $DOUBLEWORD_API_KEY"

Via Console: Monitor real-time progress in the Batches dashboard.

Status progression:

  1. validating - Checking input file format
  2. in_progress - Processing requests
  3. completed - All requests finished

Other statuses:

  • failed - Batch failed (check error_file_id)
  • expired - Batch timed out
  • cancelling/cancelled - Batch cancelled

Response includes:

  • output_file_id - Download results here
  • error_file_id - Failed requests (if any)
  • request_counts - Total/completed/failed counts

Polling frequency: Check every 30-60 seconds during processing.

Early access: Results available via output_file_id before batch fully completes - check X-Incomplete header.
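
The status progression above can be driven by a small polling loop. A stdlib-only sketch, with the terminal statuses taken from the lists above:

```python
import json
import time
import urllib.request

# Statuses after which polling can stop, per the lists above.
TERMINAL = {"completed", "failed", "expired", "cancelled"}

def get_batch(batch_id, api_key):
    req = urllib.request.Request(
        f"https://api.doubleword.ai/v1/batches/{batch_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def wait_for_batch(batch_id, api_key, interval=30):
    # Poll every 30-60 seconds, as recommended above.
    while True:
        batch = get_batch(batch_id, api_key)
        if batch["status"] in TERMINAL:
            return batch
        time.sleep(interval)
```

On return, read output_file_id (and error_file_id if request_counts reports failures).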

Step 5: Download Results

Via API:

curl https://api.doubleword.ai/v1/files/file-output123/content \
  -H "Authorization: Bearer $DOUBLEWORD_API_KEY" \
  > results.jsonl

Via Console: Download results directly from the Batches dashboard.

Response headers:

  • X-Incomplete: true - Batch still processing, more results coming
  • X-Last-Line: 45 - Resume point for partial downloads

Output format (each line):

{
  "id": "batch-req-abc",
  "custom_id": "request-1",
  "response": {
    "status_code": 200,
    "body": {
      "id": "chatcmpl-xyz",
      "choices": [{
        "message": {
          "role": "assistant",
          "content": "The answer is 4."
        }
      }]
    }
  }
}

Download errors (if any):

curl https://api.doubleword.ai/v1/files/file-error123/content \
  -H "Authorization: Bearer $DOUBLEWORD_API_KEY" \
  > errors.jsonl

Error format (each line):

{
  "id": "batch-req-def",
  "custom_id": "request-2",
  "error": {
    "code": "invalid_request",
    "message": "Missing required parameter"
  }
}
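
With the two formats above, results and errors can be joined back to your requests by custom_id. A sketch that builds a single lookup table:

```python
import json

def index_results(results_path, errors_path=None):
    """Map custom_id -> assistant content (or error dict), per the formats above."""
    out = {}
    with open(results_path) as f:
        for line in f:
            if not line.strip():
                continue
            rec = json.loads(line)
            body = rec["response"]["body"]
            out[rec["custom_id"]] = body["choices"][0]["message"]["content"]
    if errors_path:
        with open(errors_path) as f:
            for line in f:
                if not line.strip():
                    continue
                rec = json.loads(line)
                out[rec["custom_id"]] = {"error": rec["error"]}
    return out
```

Descriptive custom_ids (as suggested in Step 1) make this table directly usable downstream.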

Tool Calling in Batches

Tool calling (function calling) enables models to intelligently select and use external tools. Doubleword maintains full OpenAI compatibility.

Example batch request with tools:

{
  "custom_id": "tool-req-1",
  "method": "POST",
  "url": "/v1/chat/completions",
  "body": {
    "model": "anthropic/claude-3-5-sonnet",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {"type": "string"}
          },
          "required": ["location"]
        }
      }
    }]
  }
}

Use cases:

  • Agents that interact with APIs at scale
  • Fetching real-time information for multiple queries
  • Executing actions through standardized tool definitions
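
In batch results, tool calls come back in the message's tool_calls array rather than in content, with function arguments encoded as a JSON string (the standard OpenAI shape the section above references). A small extractor, assuming that shape:

```python
import json

def extract_tool_calls(result_line):
    """Return (name, parsed_arguments) pairs from one batch result line."""
    message = json.loads(result_line)["response"]["body"]["choices"][0]["message"]
    return [
        # Arguments arrive as a JSON string and must be parsed separately.
        (call["function"]["name"], json.loads(call["function"]["arguments"]))
        for call in message.get("tool_calls") or []
    ]
```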

Structured Outputs in Batches

Structured outputs guarantee that model responses conform to your JSON Schema, eliminating issues with missing fields or invalid enum values.

Example batch request with structured output:

{
  "custom_id": "structured-req-1",
  "method": "POST",
  "url": "/v1/chat/completions",
  "body": {
    "model": "anthropic/claude-3-5-sonnet",
    "messages": [{"role": "user", "content": "Extract key info from: John Doe, 30 years old, lives in NYC"}],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "person_info",
        "schema": {
          "type": "object",
          "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"},
            "city": {"type": "string"}
          },
          "required": ["name", "age", "city"]
        }
      }
    }
  }
}

Benefits:

  • Guaranteed schema compliance
  • No missing required keys
  • No hallucinated enum values
  • Seamless OpenAI compatibility
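
Because the schema is enforced, the content field of each result parses directly into the expected object. A sketch against the person_info schema above (the assert simply documents the contract):

```python
import json

def parse_person(result_line):
    """Parse one structured-output result line against the person_info schema above."""
    body = json.loads(result_line)["response"]["body"]
    # content is a JSON string conforming to the requested schema.
    person = json.loads(body["choices"][0]["message"]["content"])
    assert {"name", "age", "city"} <= person.keys()
    return person
```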

autobatcher: Automatic Batching

autobatcher is a Python client that automatically converts individual API calls into batched requests, reducing costs without code changes.

Installation:

pip install autobatcher

How it works:

  1. Collection Phase: Requests accumulate during a time window (default: 1 second) or until batch size threshold
  2. Batch Submission: Collected requests are submitted together
  3. Result Polling: System monitors for completed responses
  4. Transparent Response: Your code receives standard ChatCompletion responses

Key benefit: Significant cost reduction through automatic batching while writing normal async code using the familiar OpenAI interface.

Documentation: https://github.com/doublewordai/autobatcher

Additional Operations

List All Batches

Via API:

curl https://api.doubleword.ai/v1/batches?limit=10 \
  -H "Authorization: Bearer $DOUBLEWORD_API_KEY"

Via Console: View all batches in the dashboard.

Cancel Batch

Via API:

curl https://api.

---

*Content truncated.*
