hugging-face-jobs


This skill should be used when users want to run any workload on Hugging Face Jobs infrastructure. Covers UV scripts, Docker-based jobs, hardware selection, cost estimation, authentication with tokens, secrets management, timeout configuration, and result persistence. Designed for general-purpose compute workloads including data processing, inference, experiments, batch jobs, and any Python-based tasks. Should be invoked for tasks involving cloud compute, GPU workloads, or when users mention running jobs on Hugging Face infrastructure without local setup.

Install

mkdir -p .claude/skills/hugging-face-jobs && curl -L -o skill.zip "https://mcp.directory/api/skills/download/4472" && unzip -o skill.zip -d .claude/skills/hugging-face-jobs && rm skill.zip

Installs to .claude/skills/hugging-face-jobs

About this skill

Running Workloads on Hugging Face Jobs

Overview

Run any workload on fully managed Hugging Face infrastructure. No local setup required—jobs run on cloud CPUs, GPUs, or TPUs and can persist results to the Hugging Face Hub.

Common use cases:

  • Data Processing - Transform, filter, or analyze large datasets
  • Batch Inference - Run inference on thousands of samples
  • Experiments & Benchmarks - Reproducible ML experiments
  • Model Training - Fine-tune models (see model-trainer skill for TRL-specific training)
  • Synthetic Data Generation - Generate datasets using LLMs
  • Development & Testing - Test code without local GPU setup
  • Scheduled Jobs - Automate recurring tasks

For model training specifically: See the model-trainer skill for TRL-based training workflows.

When to Use This Skill

Use this skill when users want to:

  • Run Python workloads on cloud infrastructure
  • Execute jobs without local GPU/TPU setup
  • Process data at scale
  • Run batch inference or experiments
  • Schedule recurring tasks
  • Use GPUs/TPUs for any workload
  • Persist results to the Hugging Face Hub

Key Directives

When assisting with jobs:

  1. ALWAYS use hf_jobs() MCP tool - Submit jobs using hf_jobs("uv", {...}) or hf_jobs("run", {...}). The script parameter accepts Python code directly. Do NOT save to local files unless the user explicitly requests it. Pass the script content as a string to hf_jobs().

  2. Always handle authentication - Jobs that interact with the Hub require HF_TOKEN via secrets. See Token Usage section below.

  3. Provide job details after submission - After submitting, provide job ID, monitoring URL, estimated time, and note that the user can request status checks later.

  4. Set appropriate timeouts - The default timeout (30 minutes) may be insufficient for long-running tasks; pass a longer value (e.g., "2h") when needed.
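A minimal submission sketch combining these directives (the script body, flavor, and timeout below are illustrative placeholders, not prescribed values):

```
hf_jobs("uv", {
    "script": """
# /// script
# dependencies = []
# ///
print("hello from HF Jobs")
""",
    "flavor": "cpu-basic",
    "timeout": "2h",                      # raise beyond the 30m default for long tasks
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}  # needed only for Hub operations
})
```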

Prerequisites Checklist

Before starting any job, verify:

Account & Authentication

  • Hugging Face Account with Pro, Team, or Enterprise plan (Jobs require a paid plan)
  • Authenticated login: Check with hf_whoami()
  • HF_TOKEN for Hub Access ⚠️ CRITICAL - Required for any Hub operations (push models/datasets, download private repos, etc.)
  • Token must have appropriate permissions (read for downloads, write for uploads)

Token Usage (See Token Usage section for details)

When tokens are required:

  • Pushing models/datasets to Hub
  • Accessing private repositories
  • Using Hub APIs in scripts
  • Any authenticated Hub operations

How to provide tokens:

{
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}  # Recommended: automatic token
}

⚠️ CRITICAL: The $HF_TOKEN placeholder is automatically replaced with your logged-in token. Never hardcode tokens in scripts.

Token Usage Guide

Understanding Tokens

What are HF Tokens?

  • Authentication credentials for Hugging Face Hub
  • Required for authenticated operations (push, private repos, API access)
  • Stored securely on your machine after hf auth login

Token Types:

  • Read Token - Can download models/datasets, read private repos
  • Write Token - Can push models/datasets, create repos, modify content
  • Organization Token - Can act on behalf of an organization

When Tokens Are Required

Always Required:

  • Pushing models/datasets to Hub
  • Accessing private repositories
  • Creating new repositories
  • Modifying existing repositories
  • Using Hub APIs programmatically

Not Required:

  • Downloading public models/datasets
  • Running jobs that don't interact with Hub
  • Reading public repository information

How to Provide Tokens to Jobs

Method 1: Automatic Token (Recommended)

hf_jobs("uv", {
    "script": "your_script.py",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}  # ✅ Automatic replacement
})

How it works:

  • $HF_TOKEN is a placeholder that gets replaced with your actual token
  • Uses the token from your logged-in session (hf auth login)
  • Most secure and convenient method
  • Token is encrypted server-side when passed as a secret

Benefits:

  • No token exposure in code
  • Uses your current login session
  • Automatically updated if you re-login
  • Works seamlessly with MCP tools

Method 2: Explicit Token (Not Recommended)

hf_jobs("uv", {
    "script": "your_script.py",
    "secrets": {"HF_TOKEN": "hf_abc123..."}  # ⚠️ Hardcoded token
})

When to use:

  • Only if automatic token doesn't work
  • Testing with a specific token
  • Organization tokens (use with caution)

Security concerns:

  • Token visible in code/logs
  • Must manually update if token rotates
  • Risk of token exposure

Method 3: Environment Variable (Less Secure)

hf_jobs("uv", {
    "script": "your_script.py",
    "env": {"HF_TOKEN": "hf_abc123..."}  # ⚠️ Less secure than secrets
})

Difference from secrets:

  • env variables are visible in job logs
  • secrets are encrypted server-side
  • Always prefer secrets for tokens

Using Tokens in Scripts

In your Python script, tokens are available as environment variables:

# /// script
# dependencies = ["huggingface-hub"]
# ///

import os
from huggingface_hub import HfApi

# Token is automatically available if passed via secrets
token = os.environ.get("HF_TOKEN")

# Use with Hub API
api = HfApi(token=token)

# Or let huggingface_hub auto-detect
api = HfApi()  # Automatically uses HF_TOKEN env var

Best practices:

  • Don't hardcode tokens in scripts
  • Use os.environ.get("HF_TOKEN") to access
  • Let huggingface_hub auto-detect when possible
  • Verify token exists before Hub operations
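These practices can be sketched as a small helper (`get_hf_token` is a hypothetical name for illustration, not part of `huggingface_hub`):

```python
import os

def get_hf_token(required=True):
    """Fetch HF_TOKEN from the environment, failing loudly when required."""
    token = os.environ.get("HF_TOKEN")
    if required and not token:
        raise RuntimeError("HF_TOKEN not set; pass it via the job's secrets")
    return token
```

Calling such a helper at the top of a job script surfaces a missing token immediately, instead of failing mid-run on the first Hub call.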

Token Verification

Check if you're logged in:

from huggingface_hub import whoami
user_info = whoami()  # Returns a dict with your account info (e.g., user_info["name"]) if authenticated

Verify token in job:

import os
assert "HF_TOKEN" in os.environ, "HF_TOKEN not found!"
token = os.environ["HF_TOKEN"]
print(f"Token starts with: {token[:7]}...")  # Should start with "hf_"

Common Token Issues

Error: 401 Unauthorized

  • Cause: Token missing or invalid
  • Fix: Add secrets={"HF_TOKEN": "$HF_TOKEN"} to job config
  • Verify: Check hf_whoami() works locally

Error: 403 Forbidden

  • Cause: Token lacks the required permissions (e.g., a read token used for a write operation)
  • Fix: Create a token with write permissions in Hub settings, or pass one that has them
  • Verify: Check the token's permissions under your Hugging Face account settings

Error: Token not found in environment

  • Cause: secrets not passed or wrong key name
  • Fix: Use secrets={"HF_TOKEN": "$HF_TOKEN"} (not env)
  • Verify: Script checks os.environ.get("HF_TOKEN")

Error: Repository access denied

  • Cause: Token doesn't have access to private repo
  • Fix: Use token from account with access
  • Check: Verify repo visibility and your permissions

Token Security Best Practices

  1. Never commit tokens - Use $HF_TOKEN placeholder or environment variables
  2. Use secrets, not env - Secrets are encrypted server-side
  3. Rotate tokens regularly - Generate new tokens periodically
  4. Use minimal permissions - Create tokens with only needed permissions
  5. Don't share tokens - Each user should use their own token
  6. Monitor token usage - Check token activity in Hub settings

Complete Token Example

# Example: Push results to Hub
hf_jobs("uv", {
    "script": """
# /// script
# dependencies = ["huggingface-hub", "datasets"]
# ///

import os
from huggingface_hub import HfApi
from datasets import Dataset

# Verify token is available
assert "HF_TOKEN" in os.environ, "HF_TOKEN required!"

# Use token for Hub operations
api = HfApi(token=os.environ["HF_TOKEN"])

# Create and push dataset
data = {"text": ["Hello", "World"]}
dataset = Dataset.from_dict(data)
dataset.push_to_hub("username/my-dataset", token=os.environ["HF_TOKEN"])

print("✅ Dataset pushed successfully!")
""",
    "flavor": "cpu-basic",
    "timeout": "30m",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}  # ✅ Token provided securely
})

Quick Start: Two Approaches

Approach 1: UV Scripts (Recommended)

UV scripts use PEP 723 inline dependencies for clean, self-contained workloads.

MCP Tool:

hf_jobs("uv", {
    "script": """
# /// script
# dependencies = ["transformers", "torch"]
# ///

from transformers import pipeline
import torch

# Your workload here
classifier = pipeline("sentiment-analysis")
result = classifier("I love Hugging Face!")
print(result)
""",
    "flavor": "cpu-basic",
    "timeout": "30m"
})

CLI Equivalent:

hf jobs uv run my_script.py --flavor cpu-basic --timeout 30m

Python API:

from huggingface_hub import run_uv_job
run_uv_job("my_script.py", flavor="cpu-basic", timeout="30m")

Benefits: Direct MCP tool usage, clean code, dependencies declared inline, no file saving required

When to use: Default choice for all workloads, custom logic, any scenario requiring hf_jobs()

Custom Docker Images for UV Scripts

By default, UV scripts use ghcr.io/astral-sh/uv:python3.12-bookworm-slim. For ML workloads with complex dependencies, use pre-built images:

hf_jobs("uv", {
    "script": "inference.py",
    "image": "vllm/vllm-openai:latest",  # Pre-built image with vLLM
    "flavor": "a10g-large"
})

CLI:

hf jobs uv run --image vllm/vllm-openai:latest --flavor a10g-large inference.py

Benefits: Faster startup, pre-installed dependencies, optimized for specific frameworks

Python Version

By default, UV scripts use Python 3.12. Specify a different version:


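A sketch using PEP 723 inline metadata, which uv honors when selecting an interpreter (the `requires-python` bound shown is an example, not a recommendation):

```python
# /// script
# requires-python = ">=3.11"
# dependencies = []
# ///

import sys

# uv picks an interpreter satisfying requires-python before running the script
print(sys.version_info[:2])
```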

---

*Content truncated.*
