
RAG
A low-latency RAG (Retrieval-Augmented Generation) service that lets you upload documents and run semantic search over them, using OpenAI embeddings with local vector storage. It offers two modes: direct retrieval and LLM-powered summaries.
4461 views | 3 | Local (stdio)
What it does
- Upload and index documents with vector embeddings
- Perform semantic search with cosine similarity
- Generate AI summaries of retrieved content
- Filter documents by metadata
- Configure multiple embedding providers
- Manage documents through web interface
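The search and filter features above can be illustrated with a minimal sketch. This is not the service's actual code or API: the `Doc` class, the `search` function, its `where` parameter, and the toy three-dimensional vectors are all hypothetical, standing in for real embeddings and the local vector store. It only shows the general idea of filtering by metadata and then ranking by cosine similarity.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Doc:
    # Hypothetical record: text, its embedding vector, and free-form metadata.
    text: str
    embedding: List[float]
    metadata: Dict[str, str] = field(default_factory=dict)


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    query_embedding: List[float],
    docs: List[Doc],
    top_k: int = 3,
    where: Optional[Dict[str, str]] = None,
) -> List[Tuple[float, Doc]]:
    """Keep docs whose metadata matches every key in `where`, then rank by similarity."""
    where = where or {}
    candidates = [
        d for d in docs
        if all(d.metadata.get(key) == value for key, value in where.items())
    ]
    scored = [(cosine_similarity(query_embedding, d.embedding), d) for d in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy vectors for illustration only; a real index would store provider embeddings.
docs = [
    Doc("Refund policy", [0.9, 0.1, 0.0], {"team": "support"}),
    Doc("Deployment guide", [0.1, 0.9, 0.1], {"team": "infra"}),
    Doc("Billing FAQ", [0.8, 0.2, 0.1], {"team": "support"}),
]
query = [1.0, 0.0, 0.0]
hits = search(query, docs, top_k=2, where={"team": "support"})
for score, doc in hits:
    print(f"{score:.3f}  {doc.text}")
```

In this sketch the metadata filter runs before scoring, so documents outside the filter never reach the similarity step. A linear scan like this is only reasonable for small local collections.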
Best for
- Building RAG applications with document Q&A
- Local knowledge base search and retrieval
- Document analysis with AI summarization
- Prototyping semantic search features
Sub-100ms local retrieval; dual modes (raw retrieval and AI summary); web UI for document management