Learn about the internal architecture and components of the S3 Documentation MCP server.
The system consists of 6 main components working together to provide RAG capabilities:
🌐 MCP Server
HTTP server implementing the MCP protocol. Routes requests to tools and resources.
☁️ S3Loader
Manages S3 communication. Lists, downloads, and tracks file changes via ETags.
🔄 SyncService
Orchestrates synchronization between S3 and vector store. Detects changes.
🧠 VectorStore
Chunks documents, generates embeddings, stores vectors, performs similarity search.
📡 Embedding Providers
Generate vector embeddings via Ollama (local) or OpenAI (cloud).
📚 ResourceService
Manages MCP Resources API for file discovery and direct access.
MCP Client → MCP Server → [S3Loader ↔ S3 Bucket]
↓
SyncService → VectorStore → HNSWLib Index
↓
Embedding Provider (Ollama / OpenAI)
The main HTTP server that implements the Model Context Protocol.
Responsibilities:
Technologies:
@modelcontextprotocol/sdk for MCP protocol
Endpoints:
POST /mcp - MCP protocol endpoint
GET /health - Health check
Manages communication with S3-compatible storage.
List Files
Scans bucket for .md files and retrieves metadata with ETags
Download Content
Fetches file contents in parallel for efficiency
Track Changes
Uses ETags to detect modifications without re-downloading
Handle Errors
Robust retry logic and error handling
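The ETag-based change tracking described above can be sketched as a pure comparison between the current bucket listing and the ETags recorded at the previous sync. This is a minimal illustration, not the server's actual code: the `RemoteFile` and `ChangeSet` shapes and the `detectChanges` name are assumptions made for this example.

```typescript
// Hypothetical shape of one listed S3 object; only the key and ETag matter here.
interface RemoteFile {
  key: string;
  etag: string;
}

// Hypothetical result type: keys grouped by what happened since the last sync.
interface ChangeSet {
  added: string[];
  modified: string[];
  deleted: string[];
}

// Compare the current listing against the ETags recorded at the last sync.
function detectChanges(
  listing: RemoteFile[],
  known: Map<string, string>,
): ChangeSet {
  const changes: ChangeSet = { added: [], modified: [], deleted: [] };
  const seen = new Set<string>();

  for (const file of listing) {
    seen.add(file.key);
    const previous = known.get(file.key);
    if (previous === undefined) {
      changes.added.push(file.key); // never indexed before
    } else if (previous !== file.etag) {
      changes.modified.push(file.key); // content changed since last sync
    }
  }

  // Anything recorded earlier but missing from the listing was removed from S3.
  known.forEach((_etag, key) => {
    if (!seen.has(key)) changes.deleted.push(key);
  });

  return changes;
}
```

Only keys in `added` and `modified` would need to be downloaded, which is what lets a sync skip unchanged files.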
Technologies:
@aws-sdk/client-s3
S3 Sync Flow:
SyncService → S3Loader → ListObjectsV2 → S3 Bucket
S3Loader compares ETags to detect changes
GetObject
SyncService
Orchestrates the synchronization between S3 and the vector store.
Sync Modes:
When: At server startup
Behavior:
Best for: Most deployments
When: At regular intervals (configurable)
Behavior:
Best for: Frequently updated docs
When: Only on refresh_index tool call
Behavior:
Best for: Development, testing
Incremental Sync Algorithm:
Full Sync Algorithm:
Manages document chunking, embedding generation, and vector similarity search.
Document Processing Pipeline:
Markdown File → Text Splitter → Chunks → Embedding Provider
↓
Vectors (embeddings)
↓
HNSWLib Index
↓
Search Results
Chunking Strategy:
Chunk Size
1000 characters (default)
Configurable via RAG_CHUNK_SIZE
Overlap
200 characters (default)
Prevents context loss at boundaries
Method
Recursive character splitting
Respects markdown structure
Preserves
Markdown formatting
Code blocks, headings, lists intact
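To show how chunk size and overlap interact, here is a deliberately simplified fixed-window splitter. It is not the recursive, markdown-aware splitter described above and it will cut through headings and code blocks; the `chunkText` name is an assumption for this sketch, and the defaults mirror the documented 1000/200 values.

```typescript
// Simplified fixed-window splitter, for illustration only.
// Unlike a recursive splitter, it ignores markdown structure entirely.
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap; // distance between consecutive chunk starts
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this chunk reached the end
  }
  return chunks;
}
```

With the defaults, each chunk starts 800 characters after the previous one, so the last 200 characters of a chunk are repeated at the start of the next. A 2,500-character document yields three chunks.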
Technologies:
hnswlib-node for vector indexing
langchain for document processing
Index Storage:
./data/hnswlib-store/
├── args.json # Index configuration
├── docstore.json # Document metadata
└── hnswlib.index # Vector index (binary)
What is HNSWLib?
HNSWLib is a fast, in-memory vector search library that implements the HNSW (Hierarchical Navigable Small World) algorithm for approximate nearest-neighbor search.
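For intuition about what a similarity search computes, the sketch below runs an exhaustive top-k search by scoring every stored vector. This is the brute-force baseline that an HNSW index approximates much faster, not the HNSW algorithm itself. Cosine similarity is an assumption here, since the distance metric the server uses is not stated, and the function names are illustrative.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as dissimilar
  return dot / Math.sqrt(normA * normB);
}

interface Scored {
  id: string;
  score: number;
}

// Exhaustive search: score every stored vector, then keep the best k.
function topK(query: number[], store: Map<string, number[]>, k: number): Scored[] {
  const scored: Scored[] = [];
  store.forEach((vector, id) => {
    scored.push({ id, score: cosineSimilarity(query, vector) });
  });
  return scored.sort((x, y) => y.score - x.score).slice(0, k);
}
```

The exhaustive version costs time proportional to the number of stored chunks per query. A graph-based index avoids visiting most vectors and accepts a small chance of missing a true nearest neighbor in exchange.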
Generate vector embeddings for text chunks.
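As a rough sketch of what a provider does, the class below posts one text to Ollama's `/api/embeddings` endpoint and returns the resulting vector. The `EmbeddingProvider` interface, the `OllamaEmbeddingSketch` class, and the injectable `FetchLike` type are assumptions made for this example; the server's real provider classes may be shaped differently.

```typescript
// Hypothetical provider interface; the server's real signatures may differ.
interface EmbeddingProvider {
  embed(text: string): Promise<number[]>;
}

// Minimal subset of the fetch API used here, so a stub can be injected in tests.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

class OllamaEmbeddingSketch implements EmbeddingProvider {
  private readonly fetchImpl: FetchLike;
  private readonly baseUrl: string;
  private readonly model: string;

  constructor(
    fetchImpl: FetchLike,
    baseUrl = "http://localhost:11434",
    model = "nomic-embed-text",
  ) {
    this.fetchImpl = fetchImpl;
    this.baseUrl = baseUrl;
    this.model = model;
  }

  async embed(text: string): Promise<number[]> {
    // Ollama's embeddings endpoint takes a model name and a single prompt.
    const response = await this.fetchImpl(`${this.baseUrl}/api/embeddings`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model: this.model, prompt: text }),
    });
    if (!response.ok) {
      throw new Error(`Ollama returned HTTP ${response.status}`);
    }
    const data = (await response.json()) as { embedding?: number[] };
    if (!Array.isArray(data.embedding)) {
      throw new Error("response did not contain an embedding array");
    }
    return data.embedding;
  }
}
```

On Node.js 18 or later, the global `fetch` should satisfy `FetchLike`, so `new OllamaEmbeddingSketch(fetch)` would talk to a local Ollama instance. Injecting the function also keeps the class testable without a running server.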
Configuration:
http://localhost:11434/api/embeddings
nomic-embed-text
API Flow:
VectorStore sends text to OllamaProvider
OllamaProvider calls POST /api/embeddings
Advantages:
Configuration:
https://api.openai.com/v1/embeddings
text-embedding-3-small (1536 dimensions) or text-embedding-3-large (3072 dimensions)
API Flow:
VectorStore sends text to OpenAIProvider
OpenAIProvider calls POST /v1/embeddings
Advantages:
Cost:
text-embedding-3-small: ~$0.00002/1K tokens
text-embedding-3-large: ~$0.00013/1K tokens
Manages the MCP Resources API for file discovery and access.
Capabilities:
List Resources
Returns all indexed files with metadata (size, chunks, modified date)
Generate URIs
Creates unique s3doc:// URIs for each file
Provide Descriptions
Human-readable metadata for each resource
Read Files
Retrieves full file contents by URI
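A resource entry with an `s3doc://` URI and a human-readable description could be built from file metadata roughly as follows. The `IndexedFile` record, its field names, and both helper functions are hypothetical. Using the bare file name in the description and an ISO timestamp for the modified date are also assumptions, and keys containing characters that need URI encoding are not handled.

```typescript
// Hypothetical metadata record for an indexed file; field names are illustrative.
interface IndexedFile {
  key: string; // S3 object key, e.g. "docs/file.md"
  sizeBytes: number;
  chunkCount: number;
  lastModified: Date;
}

interface ResourceDescriptor {
  uri: string;
  name: string;
  description: string;
  mimeType: string;
}

// Build the resource entry for one file.
function toResource(file: IndexedFile): ResourceDescriptor {
  const name = file.key.split("/").pop() ?? file.key;
  const sizeKb = (file.sizeBytes / 1024).toFixed(1);
  const modified = file.lastModified.toISOString();
  return {
    uri: `s3doc://${file.key}`,
    name,
    description: `${name} - Size: ${sizeKb} KB, Chunks: ${file.chunkCount}, Modified: ${modified}`,
    mimeType: "text/markdown",
  };
}

// Inverse mapping: recover the S3 key from a resource URI before reading the file.
function keyFromUri(uri: string): string {
  const prefix = "s3doc://";
  if (!uri.startsWith(prefix)) throw new Error(`not an s3doc URI: ${uri}`);
  return uri.slice(prefix.length);
}
```

Keeping the URI a direct function of the S3 key means a read request can be resolved without a lookup table.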
Resource Format:
{
  "uri": "s3doc://docs/file.md",
  "name": "file.md",
  "description": "File Name - Size: X KB, Chunks: Y, Modified: Z",
  "mimeType": "text/markdown"
}
search_documentation with query
Typical Latency:
refresh_index or automatic sync triggers
ListObjectsV2
GetObject
Typical Sync Times:
| Files | Total Size | Ollama | OpenAI |
|---|---|---|---|
| 100 | 5 MB | ~1 min | ~30 sec |
| 500 | 25 MB | ~5 min | ~2 min |
| 1000 | 50 MB | ~10 min | ~4 min |
Measured on M1 MacBook Pro
Embedding Generation
10-50ms
Varies by provider
Vector Search
10-30ms
HNSWLib lookup
Total Latency
50-100ms
End-to-end search time
| Component | 100 files | 1000 files |
|---|---|---|
| Vector index | ~5 MB | ~50 MB |
| Docstore metadata | ~100 KB | ~1 MB |
| Total | ~5 MB | ~51 MB |
Target Scale
Personal use, small teams
< 5000 files
Storage
File-based (HNSWLib)
Simple and portable
Search
In-memory vectors
Very fast lookups
Concurrency
Multiple searches
Handles concurrent users
If you need enterprise scale:
Node.js
>= 18
JavaScript runtime
TypeScript
Full typing
Type safety throughout
@modelcontextprotocol/sdk - MCP protocol implementation
express - HTTP server and routing
@aws-sdk/client-s3 - S3 client library
hnswlib-node - Fast vector indexing and search
langchain - Document processing and chunking
ollama - Local embedding generation
openai - Cloud embedding API
vitest - Fast unit testing
eslint - Code linting
prettier - Code formatting
typescript - Type checking
When ENABLE_AUTH=true:
Authorization: Bearer <key> header, or
?api_key=<key> query parameter
401 Unauthorized
/health always accessible
.env file
INFO
Normal operations
Sync, search, startup
WARN
Non-critical issues
Fallbacks, deprecations
ERROR
Critical failures
S3 errors, sync failures
curl http://localhost:3000/health
Returns:
Currently logged but not exported:
Simplicity
File-based storage, no database setup required
Performance
Fast approximate nearest neighbor search
Portability
Easy to backup, version, and migrate
Cost
No cloud database costs
Flexibility: Users choose based on their needs:
No vendor lock-in. Switch anytime.
Potential improvements (not implemented):
📊 Metrics Export
Prometheus metrics for monitoring
🔍 Query Analytics
Track popular queries and patterns
🔄 Webhook Support
React to S3 events in real-time
🌐 Multi-language
Better non-English support
🎯 Relevance Feedback
Learn from user feedback
📈 Usage Dashboards
Visual analytics and insights
API Reference
Complete HTTP and MCP API documentation
MCP Tools
Learn about the 3 MCP tools available
Configuration
Environment variables and settings