deepgram-performance-tuning
Optimize Deepgram API performance for faster transcription and lower latency. Use when improving transcription speed, reducing latency, or optimizing audio processing pipelines. Trigger with phrases like "deepgram performance", "speed up deepgram", "optimize transcription", "deepgram latency", "deepgram faster".
Install
mkdir -p .claude/skills/deepgram-performance-tuning && curl -L -o skill.zip "https://mcp.directory/api/skills/download/1590" && unzip -o skill.zip -d .claude/skills/deepgram-performance-tuning && rm skill.zip
Installs to .claude/skills/deepgram-performance-tuning
About this skill
Deepgram Performance Tuning
Overview
Optimize Deepgram integration performance through audio preprocessing, connection management, and configuration tuning.
Prerequisites
- Working Deepgram integration
- Performance monitoring in place
- Audio processing capabilities
- Baseline metrics established
Performance Factors
| Factor | Impact | Optimization |
|---|---|---|
| Audio Format | High | Convert to 16-bit PCM WAV, mono |
| Sample Rate | Medium | Resample to 16 kHz for speech |
| File Size | High | Stream large files |
| Model Choice | High | Balance accuracy vs speed |
| Network Latency | Medium | Use closest region |
| Concurrency | Medium | Manage connections |
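Since the prerequisites call for baseline metrics, it can help to record latency before changing any of these factors. The following is a minimal sketch of one way to do that; `timed` and `summarize` are hypothetical helper names introduced here for illustration, not part of the Deepgram SDK.

```typescript
// Hypothetical helpers for recording a latency baseline; not part of any SDK.

/** Run an async operation and return its result with the elapsed wall-clock time. */
async function timed<T>(fn: () => Promise<T>): Promise<{ value: T; ms: number }> {
  const start = performance.now();
  const value = await fn();
  return { value, ms: performance.now() - start };
}

/** Summarize latency samples (in ms) using the nearest-rank percentile method. */
function summarize(samples: number[]): { p50: number; p95: number; max: number } {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = (p: number) =>
    sorted[Math.max(0, Math.ceil((p / 100) * sorted.length) - 1)];
  return { p50: rank(50), p95: rank(95), max: sorted[sorted.length - 1] };
}
```

Wrapping each transcription call in `timed` and feeding the collected `ms` values to `summarize` before and after a change gives a like-for-like comparison.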
Instructions
Step 1: Optimize Audio Format
Preprocess audio for optimal transcription.
Step 2: Configure Connection Pooling
Reuse connections for better throughput.
Step 3: Tune API Parameters
Select appropriate model and features.
Step 4: Implement Streaming
Use streaming for real-time and large files.
Examples
Audio Preprocessing
// lib/audio-optimizer.ts
import ffmpeg from 'fluent-ffmpeg';
import { Readable } from 'stream';

interface OptimizedAudio {
  buffer: Buffer;
  mimetype: string;
  sampleRate: number;
  channels: number;
  duration: number; // seconds (approximate)
}

export async function optimizeAudio(inputPath: string): Promise<OptimizedAudio> {
  return new Promise((resolve, reject) => {
    const chunks: Buffer[] = [];
    // Optimal settings for Deepgram
    ffmpeg(inputPath)
      .audioCodec('pcm_s16le') // 16-bit PCM
      .audioChannels(1) // Mono
      .audioFrequency(16000) // 16kHz (optimal for speech)
      .format('wav')
      .on('error', reject)
      .on('end', () => {
        const buffer = Buffer.concat(chunks);
        resolve({
          buffer,
          mimetype: 'audio/wav',
          sampleRate: 16000,
          channels: 1,
          // 16-bit mono = 2 bytes per sample. The byte count includes the
          // WAV header, so this slightly overestimates the true duration.
          duration: buffer.length / (16000 * 2),
        });
      })
      .pipe() // with no argument, returns a stream of the encoded output
      .on('data', (chunk: Buffer) => chunks.push(chunk));
  });
}

// For already loaded audio data
export async function optimizeAudioBuffer(
  audioBuffer: Buffer,
  inputFormat: string
): Promise<Buffer> {
  return new Promise((resolve, reject) => {
    const chunks: Buffer[] = [];
    // Wrap the in-memory buffer in a stream that ffmpeg can read from
    const readable = new Readable();
    readable.push(audioBuffer);
    readable.push(null); // signal end of input
    ffmpeg(readable)
      .inputFormat(inputFormat)
      .audioCodec('pcm_s16le')
      .audioChannels(1)
      .audioFrequency(16000)
      .format('wav')
      .on('error', reject)
      .on('end', () => resolve(Buffer.concat(chunks)))
      .pipe()
      .on('data', (chunk: Buffer) => chunks.push(chunk));
  });
}
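The `duration` field above comes from a simple byte-rate calculation. As a worked sketch of that arithmetic (the helper name `pcmDurationSeconds` is hypothetical), the duration of uncompressed PCM is the payload size divided by the bytes produced per second:

```typescript
/**
 * Duration in seconds of raw PCM audio.
 * bytes per second = sampleRate * channels * bytesPerSample
 */
function pcmDurationSeconds(
  dataBytes: number,
  sampleRate = 16000,
  channels = 1,
  bytesPerSample = 2 // 16-bit samples
): number {
  return dataBytes / (sampleRate * channels * bytesPerSample);
}

// 16 kHz mono 16-bit audio produces 16000 * 1 * 2 = 32000 bytes per second,
// so 320000 bytes of sample data is 10 seconds.
const tenSeconds = pcmDurationSeconds(320_000);
```

For an exact figure, pass only the size of the sample data. A minimal PCM WAV header is 44 bytes, and encoders may write additional metadata chunks, so the total file size is a slight overestimate.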
Connection Pooling
// lib/connection-pool.ts
import { createClient, DeepgramClient } from '@deepgram/sdk';

interface PoolConfig {
  minSize: number;
  maxSize: number;
  acquireTimeout: number; // ms to wait for a free client
  idleTimeout: number; // ms; reserved, not enforced by this class
}

class DeepgramConnectionPool {
  private pool: DeepgramClient[] = [];
  private inUse: Set<DeepgramClient> = new Set();
  private waiting: Array<(client: DeepgramClient) => void> = [];
  private config: PoolConfig;
  private apiKey: string;

  constructor(apiKey: string, config: Partial<PoolConfig> = {}) {
    this.apiKey = apiKey;
    this.config = {
      minSize: config.minSize ?? 2,
      maxSize: config.maxSize ?? 10,
      acquireTimeout: config.acquireTimeout ?? 10000,
      idleTimeout: config.idleTimeout ?? 60000,
    };
    // Initialize minimum connections
    for (let i = 0; i < this.config.minSize; i++) {
      this.pool.push(createClient(this.apiKey));
    }
  }

  async acquire(): Promise<DeepgramClient> {
    // Try to get from pool
    if (this.pool.length > 0) {
      const client = this.pool.pop()!;
      this.inUse.add(client);
      return client;
    }
    // Create new if under max
    if (this.inUse.size < this.config.maxSize) {
      const client = createClient(this.apiKey);
      this.inUse.add(client);
      return client;
    }
    // Wait for available connection
    return new Promise((resolve, reject) => {
      const waiter = (client: DeepgramClient) => {
        clearTimeout(timeout);
        resolve(client);
      };
      const timeout = setTimeout(() => {
        // Remove the queued waiter itself (not `resolve`), so release()
        // never hands a client to a caller that has already timed out.
        const index = this.waiting.indexOf(waiter);
        if (index > -1) this.waiting.splice(index, 1);
        reject(new Error('Connection acquire timeout'));
      }, this.config.acquireTimeout);
      this.waiting.push(waiter);
    });
  }

  release(client: DeepgramClient): void {
    this.inUse.delete(client);
    if (this.waiting.length > 0) {
      // Hand the client directly to the longest-waiting caller
      const waiter = this.waiting.shift()!;
      this.inUse.add(client);
      waiter(client);
    } else {
      this.pool.push(client);
    }
  }

  // Acquire a client, run fn, and always release the client afterwards
  async execute<T>(fn: (client: DeepgramClient) => Promise<T>): Promise<T> {
    const client = await this.acquire();
    try {
      return await fn(client);
    } finally {
      this.release(client);
    }
  }

  getStats() {
    return {
      poolSize: this.pool.length,
      inUse: this.inUse.size,
      waiting: this.waiting.length,
    };
  }
}

export const pool = new DeepgramConnectionPool(process.env.DEEPGRAM_API_KEY!);
Streaming for Large Files
// lib/streaming-transcription.ts
// Event names below assume @deepgram/sdk v3, where live events are exposed
// through the LiveTranscriptionEvents enum.
import { createClient, LiveTranscriptionEvents } from '@deepgram/sdk';
import { createReadStream } from 'fs';

interface StreamingOptions {
  chunkSize: number;
  model: string;
}

export async function streamLargeFile(
  filePath: string,
  options: Partial<StreamingOptions> = {}
): Promise<string> {
  const { chunkSize = 1024 * 1024, model = 'nova-2' } = options;
  const client = createClient(process.env.DEEPGRAM_API_KEY!);
  const transcripts: string[] = [];
  // Use live transcription for streaming
  const connection = client.listen.live({
    model,
    smart_format: true,
    punctuate: true,
  });
  return new Promise((resolve, reject) => {
    connection.on(LiveTranscriptionEvents.Open, () => {
      // Start reading only once the socket is open
      const stream = createReadStream(filePath, { highWaterMark: chunkSize });
      stream.on('data', (chunk: Buffer) => {
        connection.send(chunk);
      });
      stream.on('end', () => {
        connection.finish(); // tell the server no more audio is coming
      });
      stream.on('error', reject);
    });
    connection.on(LiveTranscriptionEvents.Transcript, (data) => {
      // Keep only finalized segments; interim results are superseded later
      if (data.is_final) {
        transcripts.push(data.channel.alternatives[0].transcript);
      }
    });
    connection.on(LiveTranscriptionEvents.Close, () => {
      resolve(transcripts.join(' '));
    });
    connection.on(LiveTranscriptionEvents.Error, reject);
  });
}
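The transcript handling in the example above keeps only messages marked `is_final` and joins them with spaces. That step can be isolated and unit-tested without a network connection. The sketch below is one way to do it; `LiveResult` and `TranscriptAssembler` are hypothetical names, and the message shape is an assumption that mirrors only the fields the example reads.

```typescript
// Assumed message shape: only the fields the streaming example reads.
interface LiveResult {
  is_final?: boolean;
  channel: { alternatives: Array<{ transcript: string }> };
}

/** Collects finalized segments and joins them into one transcript. */
class TranscriptAssembler {
  private readonly segments: string[] = [];

  /** Record a message; non-final results and empty segments are ignored. */
  add(message: LiveResult): void {
    if (!message.is_final) return;
    const text = message.channel.alternatives[0]?.transcript.trim() ?? '';
    if (text.length > 0) this.segments.push(text);
  }

  /** The transcript assembled so far, segments separated by single spaces. */
  text(): string {
    return this.segments.join(' ');
  }
}
```

Skipping empty segments avoids doubled spaces in the joined result, and the optional chaining guards against a message with no alternatives.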
Model Selection for Speed
// lib/model-selector.ts
interface ModelConfig {
  name: string;
  accuracy: 'high' | 'medium' | 'low';
  speed: 'fast' | 'medium' | 'slow';
  costPerMinute: number;
}

const models: Record<string, ModelConfig> = {
  'nova-2': {
    name: 'Nova-2',
    accuracy: 'high',
    speed: 'fast',
    costPerMinute: 0.0043,
  },
  'nova': {
    name: 'Nova',
    accuracy: 'high',
    speed: 'fast',
    costPerMinute: 0.0043,
  },
  'enhanced': {
    name: 'Enhanced',
    accuracy: 'medium',
    speed: 'fast',
    costPerMinute: 0.0145,
  },
  'base': {
    name: 'Base',
    accuracy: 'low',
    speed: 'fast',
    costPerMinute: 0.0048,
  },
};

export function selectModel(requirements: {
  prioritize: 'accuracy' | 'speed' | 'cost';
  minAccuracy?: 'high' | 'medium' | 'low';
}): string {
  const { prioritize, minAccuracy = 'low' } = requirements;
  // Ordered best to worst, so a lower index means higher accuracy
  const accuracyOrder = ['high', 'medium', 'low'];
  const minAccuracyIndex = accuracyOrder.indexOf(minAccuracy);
  // Keep only models at or above the required accuracy
  const eligible = Object.entries(models).filter(([_, config]) =>
    accuracyOrder.indexOf(config.accuracy) <= minAccuracyIndex
  );
  if (prioritize === 'accuracy') {
    return eligible.reduce((best, [name, config]) =>
      accuracyOrder.indexOf(config.accuracy) < accuracyOrder.indexOf(models[best].accuracy)
        ? name : best
    , eligible[0][0]);
  }
  if (prioritize === 'cost') {
    return eligible.reduce((best, [name, config]) =>
      config.costPerMinute < models[best].costPerMinute ? name : best
    , eligible[0][0]);
  }
  // Default: balance speed and accuracy
  return 'nova-2';
}
Parallel Processing
// lib/parallel-transcription.ts
import { pool } from './connection-pool';
import pLimit from 'p-limit';

interface TranscriptionResult {
  file: string;
  transcript: string;
  duration: number; // processing time in ms, not audio length
}

export async function transcribeMultiple(
  audioUrls: string[],
  concurrency = 5
): Promise<TranscriptionResult[]> {
  const limit = pLimit(concurrency);
  const startTime = Date.now();
  const results = await Promise.all(
    audioUrls.map((url) =>
      limit(async () => {
        const itemStart = Date.now();
        const result = await pool.execute(async (client) => {
          const { result, error } = await client.listen.prerecorded.transcribeUrl(
            { url },
            { model: 'nova-2', smart_format: true }
          );
          if (error) throw error;
          return result;
        });
        return {
          file: url,
          transcript: result.results.channels[0].alternatives[0].transcript,
          duration: Date.now() - itemStart,
        };
      })
    )
  );
  console.log(`Processed ${audioUrls.length} files in ${Date.now() - startTime}ms`);
  console.log(`Average per file: ${(Date.now() - startTime) / audioUrls.length}ms`);
  return results;
}
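To show what the concurrency limit does, here is a dependency-free sketch of the same idea. It is not how `p-limit` is implemented; `mapWithConcurrency` is a hypothetical helper that starts a fixed number of workers, each pulling the next item until none remain.

```typescript
/**
 * Map over items with at most `concurrency` workers in flight.
 * Results are returned in input order. If any worker rejects, the
 * returned promise rejects with that error.
 */
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  concurrency: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  if (!Number.isInteger(concurrency) || concurrency < 1) {
    throw new RangeError('concurrency must be a positive integer');
  }
  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner claims items one at a time until the list is exhausted.
  async function run(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  const runners = Array.from(
    { length: Math.min(concurrency, items.length) },
    () => run()
  );
  await Promise.all(runners);
  return results;
}
```

Because JavaScript runs each synchronous section to completion, claiming an index with `next++` needs no lock. The limit should stay at or below the pool's `maxSize`, otherwise callers queue inside the pool instead.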
Caching Results
// lib/transcription-cache.ts
import { createHash } from 'crypto';
import { redis } from './redis';

interface CacheOptions {
  ttl: number; // seconds
}

export class TranscriptionCache {
  private ttl: number;

  constructor(options: Partial<CacheOptions> = {}) {
    this.ttl = options.ttl ?? 3600; // 1 hour default
  }

  private getCacheKey(audioUrl: string, options: Record<string, unknown>): string {
    const hash = createHash('sha256')
      .update(JSON.stringify({ audioUrl, options }))
      .digest('hex');
    retur
---
*Content truncated.*