RAGStack MCP Server
MCP (Model Context Protocol) server for RAGStack knowledge bases. Enables AI assistants to search and chat with your knowledge base, upload documents and media, and scrape websites into it.
Installation
# Using uvx (recommended - no install needed)
uvx ragstack-mcp
# Or install globally
pip install ragstack-mcp
Configuration
Get your GraphQL endpoint and API key from the RAGStack dashboard: Settings → API Key
Claude Desktop
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (Mac) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
  "mcpServers": {
    "ragstack-kb": {
      "command": "uvx",
      "args": ["ragstack-mcp"],
      "env": {
        "RAGSTACK_GRAPHQL_ENDPOINT": "https://xxx.appsync-api.us-east-1.amazonaws.com/graphql",
        "RAGSTACK_API_KEY": "da2-xxxxxxxxxxxx"
      }
    }
  }
}
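If the config file already lists other MCP servers, the RAGStack entry has to be merged in rather than pasted over them. A minimal sketch of that merge, assuming the file uses a top-level mcpServers object as shown above; the helper name add_ragstack_server is hypothetical and not part of ragstack-mcp:

```python
import json


def add_ragstack_server(config, endpoint, api_key, name="ragstack-kb"):
    """Return a copy of `config` with a RAGStack entry under "mcpServers".

    Hypothetical helper for illustration; it only builds the JSON structure
    shown in this README and does not touch any file on disk.
    """
    updated = json.loads(json.dumps(config))  # deep copy via a JSON round trip
    servers = updated.setdefault("mcpServers", {})
    servers[name] = {
        "command": "uvx",
        "args": ["ragstack-mcp"],
        "env": {
            "RAGSTACK_GRAPHQL_ENDPOINT": endpoint,
            "RAGSTACK_API_KEY": api_key,
        },
    }
    return updated


# Example: keep an existing (made-up) server entry and add RAGStack next to it.
existing = {"mcpServers": {"other-server": {"command": "npx", "args": ["other"]}}}
merged = add_ragstack_server(
    existing,
    "https://xxx.appsync-api.us-east-1.amazonaws.com/graphql",
    "da2-xxxxxxxxxxxx",
)
print(json.dumps(merged, indent=2))
```

Write the printed JSON back to the config file and restart the client so it picks up the new server.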
Amazon Q CLI
Edit ~/.aws/amazonq/mcp.json:
{
  "mcpServers": {
    "ragstack-kb": {
      "command": "uvx",
      "args": ["ragstack-mcp"],
      "env": {
        "RAGSTACK_GRAPHQL_ENDPOINT": "https://xxx.appsync-api.us-east-1.amazonaws.com/graphql",
        "RAGSTACK_API_KEY": "da2-xxxxxxxxxxxx"
      }
    }
  }
}
Cursor
Open Settings → MCP Servers → Add Server, or edit .cursor/mcp.json:
{
  "ragstack-kb": {
    "command": "uvx",
    "args": ["ragstack-mcp"],
    "env": {
      "RAGSTACK_GRAPHQL_ENDPOINT": "https://xxx.appsync-api.us-east-1.amazonaws.com/graphql",
      "RAGSTACK_API_KEY": "da2-xxxxxxxxxxxx"
    }
  }
}
VS Code + Cline
Edit .vscode/cline_mcp_settings.json:
{
  "mcpServers": {
    "ragstack-kb": {
      "command": "uvx",
      "args": ["ragstack-mcp"],
      "env": {
        "RAGSTACK_GRAPHQL_ENDPOINT": "https://xxx.appsync-api.us-east-1.amazonaws.com/graphql",
        "RAGSTACK_API_KEY": "da2-xxxxxxxxxxxx"
      }
    }
  }
}
VS Code + Continue
Edit ~/.continue/config.json and add an entry to the mcpServers array:
{
  "mcpServers": [
    {
      "name": "ragstack-kb",
      "command": "uvx",
      "args": ["ragstack-mcp"],
      "env": {
        "RAGSTACK_GRAPHQL_ENDPOINT": "https://xxx.appsync-api.us-east-1.amazonaws.com/graphql",
        "RAGSTACK_API_KEY": "da2-xxxxxxxxxxxx"
      }
    }
  ]
}
Available Tools
search_knowledge_base
Search for relevant documents in the knowledge base.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| query | string | Yes | - | The search query |
| max_results | int | No | 5 | Maximum results to return |
chat_with_knowledge_base
Ask questions and get AI-generated answers with source citations.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| query | string | Yes | - | Your question |
| conversation_id | string | No | null | ID to maintain conversation context |
start_scrape_job
Scrape a website into the knowledge base.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | Yes | - | Starting URL to scrape |
| max_pages | int | No | 50 | Maximum pages to scrape |
| max_depth | int | No | 3 | How deep to follow links (0 = start page only) |
| scope | string | No | "HOSTNAME" | SUBPAGES, HOSTNAME, or DOMAIN |
| include_patterns | list[str] | No | null | Only scrape URLs matching these glob patterns |
| exclude_patterns | list[str] | No | null | Skip URLs matching these glob patterns |
| scrape_mode | string | No | "AUTO" | AUTO, FAST (HTTP only), or FULL (browser) |
| cookies | string | No | null | Cookie string for authenticated sites |
| force_rescrape | bool | No | false | Re-scrape even if content unchanged |
Scope values:
- SUBPAGES - Only URLs under the starting path
- HOSTNAME - All pages on the same subdomain
- DOMAIN - All subdomains of the domain
Scrape mode values:
- AUTO - Try fast mode, fall back to full for SPAs
- FAST - HTTP only, faster but may miss JavaScript content
- FULL - Uses headless browser, handles all JavaScript
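To make the scope values and the include/exclude glob patterns concrete, here is a rough sketch of how a crawler could apply them to a candidate URL. It illustrates the documented behavior and is not RAGStack's actual implementation: the function names are hypothetical, DOMAIN is approximated by the last two hostname labels (which is wrong for suffixes such as .co.uk), and glob matching is assumed to follow Python's fnmatch rules.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def in_scope(start_url, candidate_url, scope="HOSTNAME"):
    """Decide whether candidate_url falls inside the given scope (illustrative)."""
    start, cand = urlparse(start_url), urlparse(candidate_url)
    start_host = (start.hostname or "").lower()
    cand_host = (cand.hostname or "").lower()
    if scope == "SUBPAGES":
        # Same host, and the candidate path sits at or below the starting path.
        prefix = start.path.rstrip("/") + "/"
        path = cand.path if cand.path.endswith("/") else cand.path + "/"
        return cand_host == start_host and path.startswith(prefix)
    if scope == "HOSTNAME":
        return cand_host == start_host
    if scope == "DOMAIN":
        # Assumption: the registered domain is the last two labels.
        domain = ".".join(start_host.split(".")[-2:])
        return cand_host == domain or cand_host.endswith("." + domain)
    raise ValueError(f"unknown scope: {scope}")


def passes_patterns(url, include_patterns=None, exclude_patterns=None):
    """Apply exclude patterns first, then require an include match if any are given."""
    if exclude_patterns and any(fnmatchcase(url, p) for p in exclude_patterns):
        return False
    if include_patterns:
        return any(fnmatchcase(url, p) for p in include_patterns)
    return True


start = "https://react.dev/reference"
print(in_scope(start, "https://react.dev/reference/react/useState", "SUBPAGES"))
print(in_scope(start, "https://react.dev/learn", "SUBPAGES"))
print(passes_patterns("https://react.dev/reference/react/useState", ["*/reference/*"]))
```

A narrower scope combined with include_patterns is usually the cheapest way to keep a job under max_pages.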
get_scrape_job_status
Check the status of a scrape job.
| Parameter | Type | Required | Description |
|---|---|---|---|
| job_id | string | Yes | The scrape job ID |
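Scrape jobs run asynchronously, so a caller typically polls get_scrape_job_status until the job finishes. The loop below is a sketch under several assumptions: get_status is a caller-supplied function that invokes the tool and returns a dict, the "status" key and the terminal values "COMPLETED" and "FAILED" are hypothetical placeholders, and the real response shape may differ.

```python
import time


def wait_for_scrape_job(
    get_status,
    job_id,
    terminal_statuses=("COMPLETED", "FAILED"),  # hypothetical status names
    interval_seconds=5.0,
    timeout_seconds=600.0,
    sleep=time.sleep,
):
    """Poll get_status(job_id) until a terminal status appears or time runs out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        result = get_status(job_id)
        if result.get("status") in terminal_statuses:  # "status" key is assumed
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"scrape job {job_id} did not finish in time")
        sleep(interval_seconds)


# Example with a fake status source standing in for the real tool call.
_fake_statuses = iter(["PENDING", "RUNNING", "COMPLETED"])
final = wait_for_scrape_job(
    lambda job_id: {"job_id": job_id, "status": next(_fake_statuses)},
    "job-123",
    sleep=lambda seconds: None,  # skip real waiting in the example
)
print(final)
```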
list_scrape_jobs
List recent scrape jobs.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| limit | int | No | 10 | Maximum jobs to return |
upload_document_url
Get a presigned URL to upload a document or media file.
| Parameter | Type | Required | Description |
|---|---|---|---|
| filename | string | Yes | Name of the file (e.g., 'report.pdf', 'meeting.mp4') |
Supported formats:
- Documents: PDF, DOCX, XLSX, HTML, TXT, CSV, JSON, XML, EML, EPUB, Markdown
- Images: JPG, PNG, GIF, WebP, AVIF, BMP, TIFF
- Video: MP4, WebM
- Audio: MP3, WAV, M4A, OGG, FLAC
Video and audio files are transcribed with Amazon Transcribe and segmented for search.
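Before requesting a presigned URL, a client can check whether a filename falls into one of the supported categories. The sketch below mirrors the format list above; the mapping from format names to file extensions (for example "md" for Markdown or "jpeg" alongside "jpg") is an assumption, and classify_upload is a hypothetical helper rather than part of the server.

```python
from pathlib import PurePath

# Extension sets derived from the supported-formats list; the exact
# extensions the server accepts are an assumption.
SUPPORTED_EXTENSIONS = {
    "document": {"pdf", "docx", "xlsx", "html", "txt", "csv", "json", "xml", "eml", "epub", "md"},
    "image": {"jpg", "jpeg", "png", "gif", "webp", "avif", "bmp", "tiff"},
    "video": {"mp4", "webm"},
    "audio": {"mp3", "wav", "m4a", "ogg", "flac"},
}


def classify_upload(filename):
    """Return the category for filename, or None if the extension is not listed."""
    extension = PurePath(filename).suffix.lower().lstrip(".")
    for category, extensions in SUPPORTED_EXTENSIONS.items():
        if extension in extensions:
            return category
    return None


for name in ("report.pdf", "meeting.mp4", "notes.xyz"):
    print(name, "->", classify_upload(name))
```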
upload_image_url
Get a presigned URL to upload an image (step 1 of image upload workflow).
| Parameter | Type | Required | Description |
|---|---|---|---|
| filename | string | Yes | Name of the image file (e.g., 'photo.jpg') |
Supported formats: JPEG, PNG, GIF, WebP, AVIF, BMP, TIFF
generate_image_caption
Generate an AI caption for an uploaded image using a vision model (step 2, optional).
| Parameter | Type | Required | Description |
|---|---|---|---|
| s3_uri | string | Yes | S3 URI returned by upload_image_url |
submit_image
Finalize an image upload and trigger indexing (step 3).
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| image_id | string | Yes | - | Image ID from upload_image_url |
| caption | string | No | null | Primary caption |
| user_caption | string | No | null | User-provided caption |
| ai_caption | string | No | null | AI-generated caption |
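The three image tools are meant to be called in sequence: upload_image_url, then optionally generate_image_caption, then submit_image. The sketch below shows that ordering only. The call_tool and put_file callables are supplied by the caller, and the response keys "upload_url", "image_id", "s3_uri", and "caption" are hypothetical names chosen for illustration; check the actual tool responses before relying on them.

```python
def upload_image_workflow(call_tool, filename, put_file, user_caption=None, want_ai_caption=True):
    """Run the three-step image upload in order (illustrative sketch).

    call_tool(name, arguments) -> dict invokes an MCP tool.
    put_file(url) uploads the image bytes to the presigned URL.
    """
    # Step 1: request a presigned URL (response keys are assumed).
    upload = call_tool("upload_image_url", {"filename": filename})
    put_file(upload["upload_url"])

    # Step 2 (optional): ask the vision model for a caption.
    ai_caption = None
    if want_ai_caption:
        caption_result = call_tool("generate_image_caption", {"s3_uri": upload["s3_uri"]})
        ai_caption = caption_result.get("caption")

    # Step 3: finalize the upload and trigger indexing.
    arguments = {"image_id": upload["image_id"]}
    if user_caption:
        arguments["user_caption"] = user_caption
    if ai_caption:
        arguments["ai_caption"] = ai_caption
    return call_tool("submit_image", arguments)


# Example with stand-in callables instead of a live MCP session.
def _demo_call_tool(name, arguments):
    if name == "upload_image_url":
        return {"upload_url": "https://example.invalid/put", "image_id": "img-1", "s3_uri": "s3://bucket/img-1"}
    if name == "generate_image_caption":
        return {"caption": "A placeholder caption"}
    return {"submitted": arguments}


print(upload_image_workflow(_demo_call_tool, "photo.jpg", lambda url: None))
```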
Configuration Tools (Read-Only)
get_configuration
Get all current RAGStack configuration settings organized by category.
Returns settings for:
- Chat: Models, quotas, system prompt, document access
- Metadata Extraction: Enabled, model, mode (auto/manual), max keys
- Query-Time Filtering: Filter generation, multi-slice retrieval settings
- Public Access: Which endpoints allow unauthenticated access
- Document Processing: OCR backend, image caption prompt
- Media Processing: Transcribe language, speaker diarization, segment duration
- Budget: Alert thresholds
Note: Read-only. To modify settings, use the admin dashboard (Cognito auth required).
Metadata Analysis Tools
These tools help understand and optimize metadata extraction and filtering.
get_metadata_stats
Get statistics about metadata keys extracted from documents.
Returns key names, data types, occurrence counts, sample values, and status.
get_filter_examples
Get AI-generated filter examples for metadata-based search queries.
Returns filter patterns with name, description, use case, and JSON filter syntax.
Filter syntax reference:
- Basic operators: $eq, $ne, $gt, $gte, $lt, $lte, $in, $nin, $exists
- Logical operators: $and, $or
- Example: {"topic": {"$eq": "genealogy"}}
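To show how these operators combine, here is a small evaluator that checks a metadata record against a filter. It assumes the operators follow the familiar MongoDB-style meanings (for example, $ne also matches when the field is absent); RAGStack applies filters on the server side, so this is only a way to reason about a filter before using it, and the sample metadata is made up.

```python
_MISSING = object()  # sentinel for "field not present"


def _field_matches(metadata, field, conditions):
    """Check every operator in `conditions` against one metadata field."""
    value = metadata.get(field, _MISSING)
    present = value is not _MISSING
    for operator, operand in conditions.items():
        if operator == "$exists":
            ok = present == bool(operand)
        elif operator == "$eq":
            ok = present and value == operand
        elif operator == "$ne":
            ok = not present or value != operand
        elif operator == "$gt":
            ok = present and value > operand
        elif operator == "$gte":
            ok = present and value >= operand
        elif operator == "$lt":
            ok = present and value < operand
        elif operator == "$lte":
            ok = present and value <= operand
        elif operator == "$in":
            ok = present and value in operand
        elif operator == "$nin":
            ok = not present or value not in operand
        else:
            raise ValueError(f"unsupported operator: {operator}")
        if not ok:
            return False
    return True


def matches(metadata, filter_spec):
    """Return True if metadata satisfies filter_spec (assumed semantics)."""
    for key, condition in filter_spec.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        elif not _field_matches(metadata, key, condition):
            return False
    return True


record = {"topic": "genealogy", "year": 1890}  # made-up sample metadata
print(matches(record, {"topic": {"$eq": "genealogy"}}))
print(matches(record, {"$and": [{"topic": {"$eq": "genealogy"}}, {"year": {"$lt": 1900}}]}))
```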
get_key_library
Get the complete metadata key library with all discovered keys.
Returns all keys available for filtering with data types and sample values.
check_key_similarity
Check if a proposed metadata key is similar to existing keys.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| key_name | string | Yes | - | Proposed key name to check |
| threshold | float | No | 0.8 | Similarity threshold (0.0-1.0) |
Use this before adding documents with new keys to avoid duplicates.
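For intuition about what the threshold does, the sketch below compares a proposed key against existing keys using plain string similarity from Python's difflib. This is a stand-in: the README does not say which similarity measure the server uses, so scores from check_key_similarity may differ, and similar_keys is a hypothetical local helper.

```python
from difflib import SequenceMatcher


def similar_keys(key_name, existing_keys, threshold=0.8):
    """Return (existing_key, score) pairs whose similarity meets the threshold.

    Uses difflib string similarity as an assumed stand-in for the server's metric.
    """
    def normalize(key):
        return key.lower().replace("-", "_")

    results = []
    for existing in existing_keys:
        score = SequenceMatcher(None, normalize(key_name), normalize(existing)).ratio()
        if score >= threshold:
            results.append((existing, round(score, 3)))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


print(similar_keys("author", ["authors", "topic", "Author-Name"]))
```

Lowering the threshold surfaces more candidate duplicates at the cost of more false matches.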
analyze_metadata
Trigger metadata analysis to discover keys and generate filter examples.
Note: This is a long-running operation (1-2 minutes). It samples up to 1000 vectors and uses LLM analysis.
Run this after ingesting new documents or when filter generation isn't working as expected.
Usage Examples
Once configured, just ask your AI assistant naturally:
Search & Chat:
- "Search my knowledge base for authentication best practices"
- "What does our documentation say about API rate limits?"
- "What was discussed in the team meeting about deadlines?" (searches video/audio transcripts)
Web Scraping:
- "Scrape the React docs at react.dev/reference"
- "Check the status of my scrape job"
Document, Image & Media Upload:
- "Upload a new document called quarterly-report.pdf"
- "Upload this image and generate a caption for it"
- "Upload the meeting recording meeting-2024-01.mp4"
Metadata Analysis:
- "What metadata keys are available for filtering?"
- "Analyze the metadata in my knowledge base"
- "Show me the filter examples"
- "Check if 'author' is similar to any existing keys"
Configuration:
- "What are my current RAGStack settings?"
- "What model is being used for chat?"
- "Is multi-slice retrieval enabled?"
- "What are my quota limits?"
- "What language is configured for transcription?"
Environment Variables
| Variable | Required | Description |
|---|---|---|
| RAGSTACK_GRAPHQL_ENDPOINT | Yes | Your RAGStack GraphQL API URL |
| RAGSTACK_API_KEY | Yes | Your RAGStack API key |
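Both variables are required, so a wrapper script or test harness may want to fail early with a clear message when either is missing. A minimal sketch, assuming nothing about how ragstack-mcp itself reads them; load_settings is a hypothetical helper:

```python
import os

REQUIRED_VARIABLES = ("RAGSTACK_GRAPHQL_ENDPOINT", "RAGSTACK_API_KEY")


def load_settings(environ=None):
    """Return the required settings, raising if any variable is unset or empty."""
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARIABLES if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARIABLES}


# Example with an explicit mapping instead of the real process environment.
settings = load_settings({
    "RAGSTACK_GRAPHQL_ENDPOINT": "https://xxx.appsync-api.us-east-1.amazonaws.com/graphql",
    "RAGSTACK_API_KEY": "da2-xxxxxxxxxxxx",
})
print(sorted(settings))
```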
Development
# Clone and install
cd src/ragstack-mcp
uv sync
# Run locally
uv run ragstack-mcp
# Build package
uv build
# Publish to PyPI
uv publish
License
MIT

