Vectara

Name: Vectara
Rating: 4.6 (36 reviews)
Author: vectara

Official

Connects conversational AI interfaces to Vectara's RAG (Retrieval-Augmented Generation) platform. AI assistants can search documents and receive both the relevant results and an AI-generated answer, with customizable query parameters.


ai ml analytics data


What it does

Search documents using RAG queries
Generate AI responses from search results
Customize search parameters and filters
Access Vectara's trusted RAG platform
Configure authentication and rate limiting
Deploy via HTTP, SSE, or STDIO transport

Best for

AI assistants needing document search capabilities
Applications requiring RAG-powered Q&A
Conversational interfaces with knowledge bases
Reducing hallucination in AI responses

Reduced hallucination via trusted RAG
Multiple transport modes including secure HTTP
Built-in authentication and rate limiting

About Vectara

Vectara is an official MCP server, published by Vectara, that gives AI assistants retrieval-augmented generation tools through the Model Context Protocol. Chatbots can use it to deliver accurate, context-aware responses and to run advanced searches. It is categorized under ai ml and analytics data.

How to install

You can install Vectara in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.

License

Vectara is released under the Apache-2.0 license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

Vectara MCP Server


🔌 Compatible with Claude Desktop, and any other MCP Client!

The Model Context Protocol (MCP) is an open standard that enables AI systems to interact seamlessly with various data sources and tools, facilitating secure, two-way connections.

Vectara-MCP gives any agentic application access to fast, reliable RAG with reduced hallucination, powered by Vectara's Trusted RAG platform, through MCP.

Installation

You can install the package directly from PyPI:

pip install vectara-mcp

Quick Start

Secure by Default (HTTP/SSE with Authentication)

# Start server with secure HTTP transport (DEFAULT)
python -m vectara_mcp
# Server running at http://127.0.0.1:8000 with authentication enabled

Local Development Mode (STDIO)

# For Claude Desktop or local development (less secure)
python -m vectara_mcp --stdio
# ⚠️ Warning: STDIO transport is less secure. Use only for local development.

Configuration Options

# Custom host and port
python -m vectara_mcp --host 0.0.0.0 --port 8080

# SSE transport mode
python -m vectara_mcp --transport sse --path /sse

# Disable authentication (DANGEROUS - dev only)
python -m vectara_mcp --no-auth

Transport Modes

HTTP Transport (Default - Recommended)

Security: Built-in authentication via bearer tokens
Encryption: HTTPS ready
Rate Limiting: 100 requests/minute by default
CORS Protection: Configurable origin validation
Use Case: Production deployments, cloud environments
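
A client that calls the HTTP transport in a loop may want to stay under the default limit of 100 requests per minute. The following is a minimal client-side sketch of a sliding-window throttle. The class name `SlidingWindowLimiter` and its parameters are hypothetical, and it does not describe how the server itself enforces its limit.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical client-side throttle; not the server's implementation."""

    def __init__(self, max_requests=100, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock      # injectable so the limiter can be tested
        self._sent = deque()     # timestamps of requests inside the window

    def try_acquire(self):
        """Return True and record the request if it fits in the window."""
        now = self._clock()
        # Forget requests that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            self._sent.append(now)
            return True
        return False
```

Call `try_acquire()` before each request and back off (for example, sleep briefly) when it returns False.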

SSE Transport

Streaming: Server-Sent Events for real-time updates
Authentication: Bearer token support
Compatibility: Works with legacy MCP clients
Use Case: Real-time streaming applications

STDIO Transport

⚠️ Security Warning: No transport-layer security
Performance: Low latency for local communication
Use Case: Local development, Claude Desktop
Requirement: Must be explicitly enabled with --stdio flag

Environment Variables

# Required
export VECTARA_API_KEY="your-api-key"

# Optional
export VECTARA_AUTHORIZED_TOKENS="token1,token2"  # Additional auth tokens
export VECTARA_ALLOWED_ORIGINS="http://localhost:*,https://app.example.com"
export VECTARA_TRANSPORT="http"  # Default transport mode
export VECTARA_AUTH_REQUIRED="true"  # Enforce authentication
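
The variables above can be read from Python as in the sketch below. `ServerEnv`, `load_env`, and `origin_allowed` are hypothetical helpers written for illustration, not part of vectara-mcp. The glob-style matching of `http://localhost:*` is an assumption about how such a pattern could be interpreted; the server's actual origin check may differ.

```python
import os
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class ServerEnv:
    """Hypothetical container for the documented environment variables."""
    api_key: str
    authorized_tokens: list = field(default_factory=list)
    allowed_origins: list = field(default_factory=list)
    transport: str = "http"
    auth_required: bool = True


def _split_csv(value):
    """Split a comma-separated string, dropping blanks and whitespace."""
    return [item.strip() for item in value.split(",") if item.strip()]


def load_env(environ=None):
    """Read the documented variables; raise if the required key is missing."""
    env = os.environ if environ is None else environ
    api_key = env.get("VECTARA_API_KEY", "")
    if not api_key:
        raise ValueError("VECTARA_API_KEY is required")
    return ServerEnv(
        api_key=api_key,
        authorized_tokens=_split_csv(env.get("VECTARA_AUTHORIZED_TOKENS", "")),
        allowed_origins=_split_csv(env.get("VECTARA_ALLOWED_ORIGINS", "")),
        transport=env.get("VECTARA_TRANSPORT", "http"),
        auth_required=env.get("VECTARA_AUTH_REQUIRED", "true").lower() == "true",
    )


def origin_allowed(origin, patterns):
    """Assumed glob-style matching; the server's real rule may differ."""
    return any(fnmatch(origin, pattern) for pattern in patterns)
```

Passing a plain dict to `load_env` instead of relying on `os.environ` keeps the helper easy to test.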

Authentication

HTTP/SSE Transport

When using HTTP or SSE transport, authentication is required by default:

# Using curl with bearer token
curl -H "Authorization: Bearer $VECTARA_API_KEY" \
     -H "Content-Type: application/json" \
     -X POST http://localhost:8000/call/ask_vectara \
     -d '{"query": "What is Vectara?", "corpus_keys": ["my-corpus"]}'

# Using X-API-Key header (alternative)
curl -H "X-API-Key: $VECTARA_API_KEY" \
     http://localhost:8000/sse
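
The first curl call can be mirrored in Python with only the standard library. This is a sketch: the URL, headers, and body are copied from the curl example above, while the function names `build_ask_request` and `send` are hypothetical. Sending the request only works while the server is running.

```python
import json
import urllib.request


def build_ask_request(api_key, query, corpus_keys, base_url="http://localhost:8000"):
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"query": query, "corpus_keys": corpus_keys}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/call/ask_vectara",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send(request, timeout=30):
    """Send the request; raises urllib.error.URLError if no server is listening."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Typical use would be `send(build_ask_request(os.environ["VECTARA_API_KEY"], "What is Vectara?", ["my-corpus"]))`, with `os` imported by the caller.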

Disabling Authentication (Development Only)

# ⚠️ NEVER use in production
python -m vectara_mcp --no-auth

Available Tools

API Key Management

setup_vectara_api_key: Configure and validate your Vectara API key for the session (one-time setup).

Args:
- api_key: str, Your Vectara API key - required.
Returns:
- Success confirmation with masked API key or validation error.

clear_vectara_api_key: Clear the stored API key from server memory.

Returns:
- Confirmation message.

Query Tools

ask_vectara: Run a RAG query using Vectara, returning search results with a generated response.

Args:
- query: str, The user query to run - required.
- corpus_keys: list[str], List of Vectara corpus keys to use for the search - required.
- n_sentences_before: int, Number of sentences before the answer to include in the context - optional, default is 2.
- n_sentences_after: int, Number of sentences after the answer to include in the context - optional, default is 2.
- lexical_interpolation: float, The amount of lexical interpolation to use - optional, default is 0.005.
- max_used_search_results: int, The maximum number of search results to use - optional, default is 10.
- generation_preset_name: str, The name of the generation preset to use - optional, default is "vectara-summary-table-md-query-ext-jan-2025-gpt-4o".
- response_language: str, The language of the response - optional, default is "eng".
Returns:
- The response from Vectara, including the generated answer and the search results.

search_vectara: Run a semantic search query using Vectara, without generation.

Args:
- query: str, The user query to run - required.
- corpus_keys: list[str], List of Vectara corpus keys to use for the search - required.
- n_sentences_before: int, Number of sentences before the answer to include in the context - optional, default is 2.
- n_sentences_after: int, Number of sentences after the answer to include in the context - optional, default is 2.
- lexical_interpolation: float, The amount of lexical interpolation to use - optional, default is 0.005.
Returns:
- The response from Vectara, including the matching search results.
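
The documented arguments and defaults for the two query tools can be captured in a small typed structure, which helps catch a missing required field before a call is made. This is a sketch: `AskVectaraArgs` and `to_search_arguments` are hypothetical names, and only the field names and default values come from the argument lists above.

```python
from dataclasses import asdict, dataclass


@dataclass
class AskVectaraArgs:
    """Arguments for ask_vectara, with the defaults documented above."""
    query: str
    corpus_keys: list
    n_sentences_before: int = 2
    n_sentences_after: int = 2
    lexical_interpolation: float = 0.005
    max_used_search_results: int = 10
    generation_preset_name: str = "vectara-summary-table-md-query-ext-jan-2025-gpt-4o"
    response_language: str = "eng"

    def to_arguments(self):
        """Validate the two required fields and return a plain dict."""
        if not self.query.strip():
            raise ValueError("query is required")
        if not self.corpus_keys:
            raise ValueError("corpus_keys is required")
        return asdict(self)


def to_search_arguments(args):
    """Keep only the fields that search_vectara documents."""
    keep = (
        "query",
        "corpus_keys",
        "n_sentences_before",
        "n_sentences_after",
        "lexical_interpolation",
    )
    full = args.to_arguments()
    return {name: full[name] for name in keep}
```

For example, `AskVectaraArgs(query="Who is Amr Awadallah?", corpus_keys=["your-corpus-key"]).to_arguments()` yields a dict with every optional field filled in from the defaults.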

Analysis Tools

correct_hallucinations: Identify and correct hallucinations in generated text using Vectara's VHC (Vectara Hallucination Correction) API.

Args:
- generated_text: str, The generated text to analyze for hallucinations - required.
- documents: list[str], List of source documents to compare against - required.
- query: str, The original user query that led to the generated text - optional.
Returns:
- JSON-formatted string containing corrected text and detailed correction information.

eval_factual_consistency: Evaluate the factual consistency of generated text against source documents using Vectara's dedicated factual consistency evaluation API.

Args:
- generated_text: str, The generated text to evaluate for factual consistency - required.
- documents: list[str], List of source documents to compare against - required.
- query: str, The original user query that led to the generated text - optional.
Returns:
- JSON-formatted string containing factual consistency evaluation results and scoring.

Note: API key must be configured first using setup_vectara_api_key tool or VECTARA_API_KEY environment variable.
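
Both analysis tools take the same three arguments and return a JSON-formatted string, so one helper can prepare input for either. The sketch below uses hypothetical function names. The exact fields of the returned JSON are not listed here, so `summarize_result` only assumes the result is a JSON object and reports its top-level keys.

```python
import json


def build_analysis_arguments(generated_text, documents, query=None):
    """Build arguments for correct_hallucinations or eval_factual_consistency."""
    if not generated_text.strip():
        raise ValueError("generated_text is required")
    if not documents:
        raise ValueError("documents is required")
    arguments = {"generated_text": generated_text, "documents": list(documents)}
    if query is not None:
        arguments["query"] = query  # optional per the tool descriptions
    return arguments


def summarize_result(raw):
    """Parse a JSON-formatted tool result and list its top-level keys.

    Assumes the result is a JSON object; its schema is not documented here.
    """
    return sorted(json.loads(raw).keys())
```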

Configuration with Claude Desktop

To use with Claude Desktop, update your configuration to use STDIO transport:

{
  "mcpServers": {
    "Vectara": {
      "command": "python",
      "args": ["-m", "vectara_mcp", "--stdio"],
      "env": {
        "VECTARA_API_KEY": "your-api-key"
      }
    }
  }
}

Or using uv:

{
  "mcpServers": {
    "Vectara": {
      "command": "uv",
      "args": ["tool", "run", "vectara-mcp", "--stdio"]
    }
  }
}

Note: Claude Desktop requires STDIO transport. While less secure than HTTP, it's acceptable for local desktop use.
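
If you manage the Claude Desktop configuration from a script, the entries above can be generated rather than typed by hand. This is a sketch with a hypothetical helper, `add_vectara_server`. It works on an in-memory dict and leaves reading and writing the configuration file to the caller, since the file location depends on your platform.

```python
import json


def add_vectara_server(config, api_key=None, use_uv=False):
    """Return a copy of a Claude Desktop config dict with a "Vectara" entry."""
    updated = json.loads(json.dumps(config))  # deep copy via JSON round-trip
    servers = updated.setdefault("mcpServers", {})
    if use_uv:
        entry = {"command": "uv", "args": ["tool", "run", "vectara-mcp", "--stdio"]}
    else:
        entry = {"command": "python", "args": ["-m", "vectara_mcp", "--stdio"]}
    if api_key is not None:
        # Same "env" block as in the first configuration shown above.
        entry["env"] = {"VECTARA_API_KEY": api_key}
    servers["Vectara"] = entry
    return updated
```

`print(json.dumps(add_vectara_server({}, api_key="your-api-key"), indent=2))` prints a configuration equivalent to the first JSON example above.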

Usage in Claude Desktop App

Once installation is complete and the Claude Desktop app is configured, completely close and re-open the app so that it picks up the Vectara MCP server. A hammer icon in the bottom left of the app indicates that MCP tools are available; click it to see details on the Vectara tools.

Claude will then have access to the Vectara MCP server, including all six Vectara tools.

Secure Setup Workflow

First-time setup (one-time per session):

Configure your API key securely:

setup-vectara-api-key
API key: [your-vectara-api-key]

After setup, use any tools without exposing your API key:

Vectara Tool Examples

RAG Query with Generation:

ask-vectara
Query: Who is Amr Awadallah?
Corpus keys: ["your-corpus-key"]

Semantic Search Only:

search-vectara
Query: events in NYC?
Corpus keys: ["your-corpus-key"]

Hallucination Detection & Correction:

correct-hallucinations
Generated text: [text to check]
Documents: ["source1", "source2"]

Factual Consistency Evaluation:

eval-factual-consistency
Generated text: [text to evaluate]
Documents: ["reference1", "reference2"]

Security Best Practices

Always use HTTP transport for production - Never expose STDIO transport to the network
Keep authentication enabled - Only disable with --no-auth for local testing
Use HTTPS in production - Deploy behind a reverse proxy with TLS termination
Configure CORS properly - Set VECTARA_ALLOWED_ORIGINS to restrict access
Rotate API keys regularly - Update VECTARA_API_KEY and VECTARA_AUTHORIZED_TOKENS
Monitor rate limits - Default 100 req/min, adjust based on your needs

See SECURITY.md for detailed security guidelines.

Support

For issues, questions, or contributions, please visit: https://github.com/vectara/vectara-mcp
