speak-core-workflow-a


Execute the primary Speak workflow: AI conversation practice with real-time feedback. Use when implementing conversation practice features, building AI tutor interactions, or building core language-learning dialogue systems. Trigger with phrases like "speak conversation practice", "speak AI tutor", "speak dialogue", "primary speak workflow".

Install

mkdir -p .claude/skills/speak-core-workflow-a && curl -L -o skill.zip "https://mcp.directory/api/skills/download/9042" && unzip -o skill.zip -d .claude/skills/speak-core-workflow-a && rm skill.zip

Installs to .claude/skills/speak-core-workflow-a

About this skill

Speak Core Workflow A: AI Conversation Practice

Overview

Primary workflow for Speak: AI-powered conversation practice with real-time pronunciation feedback and adaptive tutoring. Speak uses GPT-4o for conversation generation and OpenAI's Realtime API for speech processing, delivering sub-second response times.

Prerequisites

  • Completed speak-install-auth setup
  • Valid API credentials configured
  • Audio handling capabilities (microphone or pre-recorded files)

Instructions

Step 1: Start a Conversation Session

import { SpeakClient } from '@speak/language-sdk';

const client = new SpeakClient({
  apiKey: process.env.SPEAK_API_KEY!,
  appId: process.env.SPEAK_APP_ID!,
  language: 'es',
});

// Start a restaurant ordering scenario in Spanish
const session = await client.startConversation({
  scenario: 'ordering-food',
  language: 'es',
  level: 'intermediate',
  nativeLanguage: 'en',
  maxTurns: 10,
  feedbackDetail: 'phoneme', // 'word' or 'phoneme'
});

console.log('Session started:', session.id);
console.log('AI Tutor:', session.firstPrompt.text);
// "Bienvenido al restaurante. Soy tu camarero. ¿Qué le gustaría ordenar?"

Step 2: Send Student Responses

// Submit audio for pronunciation scoring
const turn1 = await client.sendTurn(session.id, {
  audioPath: './recordings/student-response-1.wav',
});

console.log('Tutor:', turn1.tutorText);
console.log('Pronunciation:', turn1.pronunciationScore); // 0-100
console.log('Grammar:', turn1.corrections);
// [{original: "yo quiero", suggestion: "quisiera", note: "More polite form for ordering"}]
console.log('Vocabulary:', turn1.vocabularyNotes);
// ["camarero = waiter", "ordenar = to order"]

// Or submit text (skips pronunciation scoring)
const turn2 = await client.sendTurn(session.id, {
  text: 'Quisiera una ensalada y un vaso de agua, por favor.',
});
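
Before submitting audio, it may help to confirm that a recording is 16 kHz mono WAV, the format the Error Handling table recommends. The sketch below reads a standard RIFF/WAVE header from raw bytes. `parseWavHeader` and `isSixteenKHzMono` are hypothetical helpers written for illustration and are not part of the Speak SDK.

```typescript
// Hypothetical helpers, not part of the Speak SDK: they inspect a RIFF/WAVE
// header so obviously unsupported files can be caught before upload.
interface WavInfo {
  channels: number;
  sampleRate: number;
  bitsPerSample: number;
}

// Reads `length` ASCII characters starting at `offset`.
function readAscii(bytes: Uint8Array, offset: number, length: number): string {
  let out = '';
  for (let i = 0; i < length; i++) {
    out += String.fromCharCode(bytes[offset + i]);
  }
  return out;
}

// Walks the RIFF chunks until it finds "fmt " and returns its fields,
// or null if the bytes do not look like a WAV file.
function parseWavHeader(bytes: Uint8Array): WavInfo | null {
  if (bytes.length < 12) return null;
  if (readAscii(bytes, 0, 4) !== 'RIFF' || readAscii(bytes, 8, 4) !== 'WAVE') {
    return null;
  }

  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  let offset = 12;
  while (offset + 8 <= bytes.length) {
    const chunkId = readAscii(bytes, offset, 4);
    const chunkSize = view.getUint32(offset + 4, true); // little-endian
    if (chunkId === 'fmt ') {
      if (offset + 8 + 16 > bytes.length) return null;
      return {
        channels: view.getUint16(offset + 10, true),
        sampleRate: view.getUint32(offset + 12, true),
        bitsPerSample: view.getUint16(offset + 22, true),
      };
    }
    // Chunks are padded to an even number of bytes.
    offset += 8 + chunkSize + (chunkSize % 2);
  }
  return null;
}

// True only for WAV data whose header declares 16 kHz, single channel.
function isSixteenKHzMono(bytes: Uint8Array): boolean {
  const info = parseWavHeader(bytes);
  return info !== null && info.sampleRate === 16000 && info.channels === 1;
}
```

In Node, the bytes could come from `fs.readFileSync(path)`, since a `Buffer` is a `Uint8Array`. The check only reads the header; it does not validate the audio samples themselves.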

Step 3: Conversation Loop with Progress Tracking

async function runConversationLesson(
  client: SpeakClient,
  scenario: string,
  language: string,
  level: string,
) {
  const session = await client.startConversation({
    scenario, language, level, nativeLanguage: 'en',
  });

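  // TurnResult is assumed to be the type that sendTurn resolves to;
  // import it from the SDK if it is exported.
  const turns: TurnResult[] = [];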
  let isComplete = false;

  while (!isComplete && turns.length < 10) {
    // Display tutor prompt
    const prompt = turns.length === 0
      ? session.firstPrompt.text
      : turns[turns.length - 1].tutorText;
    console.log(`\nTutor: ${prompt}`);

    // Get student audio (mic input or file).
    // recordStudentAudio() is an application-supplied helper, not shown here.
    const audioPath = await recordStudentAudio();

    // Submit and get feedback
    const turn = await client.sendTurn(session.id, { audioPath });
    turns.push(turn);

    // Show feedback
    if (turn.pronunciationScore < 60) {
      console.log(`Pronunciation needs work: ${turn.pronunciationScore}/100`);
      console.log('Try again with this phrase.');
    }
    if (turn.corrections.length > 0) {
      console.log('Grammar notes:', turn.corrections.map(c => c.note).join('; '));
    }

    isComplete = turn.sessionComplete;
  }

  // End session and get summary
  const summary = await client.endSession(session.id);
  return summary;
}
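
The feedback checks inside the loop can be pulled into a pure function so they can be unit-tested without calling the API. This is a minimal sketch: `TurnFeedback` is an assumed shape containing only the fields the loop reads, and `formatFeedback` is a hypothetical helper, not an SDK function.

```typescript
// Assumed shape: only the fields the conversation loop reads.
// The real SDK type may carry more fields or differ in detail.
interface TurnFeedback {
  pronunciationScore: number; // 0-100
  corrections: { original: string; suggestion: string; note: string }[];
}

// Returns the feedback lines to display for one turn.
// minScore defaults to 60, the threshold used in the loop.
function formatFeedback(turn: TurnFeedback, minScore: number = 60): string[] {
  const lines: string[] = [];
  if (turn.pronunciationScore < minScore) {
    lines.push(`Pronunciation needs work: ${turn.pronunciationScore}/100`);
  }
  if (turn.corrections.length > 0) {
    lines.push('Grammar notes: ' + turn.corrections.map(c => c.note).join('; '));
  }
  return lines;
}
```

Inside the loop, the two `if` blocks could then become `formatFeedback(turn).forEach(line => console.log(line));`.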

Step 4: Multi-Topic Session

const topics = ['greetings', 'directions', 'ordering-food', 'shopping'];
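// SessionSummary is assumed to be the type that endSession resolves to;
// import it from the SDK if it is exported.
const results: SessionSummary[] = [];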

for (const topic of topics) {
  console.log(`\n=== ${topic.toUpperCase()} ===`);
  const summary = await runConversationLesson(client, topic, 'es', 'intermediate');
  results.push(summary);
  console.log(`Score: ${summary.avgPronunciationScore}/100`);
}

// Overall progress report
console.log('\n=== Session Report ===');
console.table(results.map(r => ({
  topic: r.scenario,
  pronunciation: r.avgPronunciationScore,
  grammar: r.grammarAccuracy + '%',
  newWords: r.newWords.length,
  duration: r.durationMinutes + 'min',
})));

Topic Categories

| Category | Scenarios | Level |
|---|---|---|
| Daily Life | greetings, introductions, weather | Beginner |
| Travel | directions, hotel, airport, transport | Beginner-Intermediate |
| Food & Drink | ordering-food, grocery, cooking | Intermediate |
| Business | meeting, presentation, negotiation | Intermediate-Advanced |
| Social | party, dating, opinions, debate | Advanced |
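
The table above can be encoded as data so an app can suggest scenarios that match a learner's level. This is a sketch under assumptions: the scenario names are copied from the table and assumed to be valid `scenario` values, `'advanced'` is assumed to be an accepted `level` alongside the `'beginner'` and `'intermediate'` values used elsewhere in this document, and `scenariosForLevel` is a hypothetical helper.

```typescript
type Level = 'beginner' | 'intermediate' | 'advanced';

interface Category {
  name: string;
  scenarios: string[];
  levels: Level[];
}

// Mirrors the Topic Categories table; a level range such as
// "Beginner-Intermediate" is listed as both levels.
const CATEGORIES: Category[] = [
  { name: 'Daily Life', scenarios: ['greetings', 'introductions', 'weather'], levels: ['beginner'] },
  { name: 'Travel', scenarios: ['directions', 'hotel', 'airport', 'transport'], levels: ['beginner', 'intermediate'] },
  { name: 'Food & Drink', scenarios: ['ordering-food', 'grocery', 'cooking'], levels: ['intermediate'] },
  { name: 'Business', scenarios: ['meeting', 'presentation', 'negotiation'], levels: ['intermediate', 'advanced'] },
  { name: 'Social', scenarios: ['party', 'dating', 'opinions', 'debate'], levels: ['advanced'] },
];

// Returns every scenario whose category covers the given level, in table order.
function scenariosForLevel(level: Level): string[] {
  const out: string[] = [];
  for (const category of CATEGORIES) {
    if (category.levels.indexOf(level) !== -1) {
      for (const scenario of category.scenarios) out.push(scenario);
    }
  }
  return out;
}
```

The returned list could replace the hard-coded `topics` array in Step 4, for example `const topics = scenariosForLevel('intermediate');`.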

Output

  • Conversation session with AI tutor
  • Real-time pronunciation feedback (0-100 score)
  • Grammar corrections and suggestions
  • Vocabulary notes for new words
  • Session summary with progress metrics

Error Handling

| Error | Cause | Solution |
|---|---|---|
| Session timeout | Exceeded 30 min | Auto-end with summary, start new session |
| Audio processing failed | Invalid format | Convert to WAV 16kHz mono |
| Tutor not responding | API latency | Implement 10s timeout with retry |
| Recognition failed | Poor audio quality | Prompt user to re-record in quiet environment |
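
The "10s timeout with retry" suggestion for an unresponsive tutor could be implemented with a generic wrapper like the one below. This is a sketch, not SDK functionality: `withTimeout`, `retryWithTimeout`, and `RetryOptions` are hypothetical names, and the attempt count and backoff values are illustrative defaults.

```typescript
// Hypothetical helpers, not part of the Speak SDK.
interface RetryOptions {
  attempts?: number;  // total tries, including the first
  timeoutMs?: number; // per-attempt timeout
  backoffMs?: number; // base delay, doubled after each failed attempt
}

// Rejects if `task` has not settled within `ms` milliseconds.
// The underlying operation is not cancelled; its result is simply ignored.
function withTimeout<T>(task: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    task.then(
      value => { clearTimeout(timer); resolve(value); },
      error => { clearTimeout(timer); reject(error); },
    );
  });
}

// Calls `fn` up to `attempts` times, applying the timeout to each attempt
// and waiting with exponential backoff between failures.
async function retryWithTimeout<T>(
  fn: () => Promise<T>,
  options: RetryOptions = {},
): Promise<T> {
  const attempts = options.attempts ?? 3;
  const timeoutMs = options.timeoutMs ?? 10000;
  const backoffMs = options.backoffMs ?? 500;

  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await withTimeout(fn(), timeoutMs);
    } catch (error) {
      lastError = error;
      if (i < attempts - 1) {
        await new Promise<void>(resolve => setTimeout(resolve, backoffMs * Math.pow(2, i)));
      }
    }
  }
  throw lastError;
}
```

A call from the conversation loop might look like `await retryWithTimeout(() => client.sendTurn(session.id, { audioPath }))`. Because a timed-out request is not cancelled, a retry could submit the same turn twice if the first attempt eventually succeeds on the server, so it is worth checking how the API treats repeated submissions before relying on this.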

Next Steps

For pronunciation-focused training, see speak-core-workflow-b.

Examples

Quick test: Start a greetings scenario with level: 'beginner', send 3 text responses, end session, and review the summary scores.

Full lesson: Run 4 topics in sequence, track pronunciation improvement across topics, and generate a progress report.
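
Tracking pronunciation improvement across topics, as in the full-lesson example, can be as simple as comparing each topic's average score with the previous one. The sketch below assumes a summary shape with only the `scenario` and `avgPronunciationScore` fields used in the Step 4 report; `pronunciationDeltas` is a hypothetical helper.

```typescript
// Assumed shape: the two summary fields used in the Step 4 report.
interface TopicScore {
  scenario: string;
  avgPronunciationScore: number; // 0-100
}

interface ScoreDelta {
  from: string;
  to: string;
  delta: number; // positive means the later topic scored higher
}

// Returns the score change from each topic to the next, in run order.
function pronunciationDeltas(results: TopicScore[]): ScoreDelta[] {
  const deltas: ScoreDelta[] = [];
  for (let i = 1; i < results.length; i++) {
    deltas.push({
      from: results[i - 1].scenario,
      to: results[i].scenario,
      delta: results[i].avgPronunciationScore - results[i - 1].avgPronunciationScore,
    });
  }
  return deltas;
}
```

Passing the `results` array from Step 4 to `pronunciationDeltas` and printing it with `console.table` would show where scores rose or fell. Since topics differ in difficulty, a drop between topics does not necessarily mean the learner regressed.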
