Embedding Providers

The S3 Documentation MCP server supports two embedding providers: Ollama (local) and OpenAI (cloud).

| Feature | Ollama | OpenAI |
| --- | --- | --- |
| Cost | Free | ~$0.00002/1K tokens |
| Privacy | 100% local | Data sent to OpenAI |
| Offline | ✅ Yes | ❌ No |
| Accuracy | Good | Excellent |
| Multilingual | Good | Excellent |
| Setup | Install + model download | API key only |
| Resources | Local CPU/GPU | Cloud-based |

Ollama

Recommended for: local development, privacy-conscious deployments, offline usage

  1. Install Ollama from https://ollama.ai

  2. Pull the embedding model:

     ```sh
     ollama pull nomic-embed-text
     ```

  3. Configure in .env:

     ```sh
     EMBEDDING_PROVIDER=ollama
     OLLAMA_BASE_URL=http://localhost:11434
     OLLAMA_EMBEDDING_MODEL=nomic-embed-text
     ```
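Once the model is pulled, you can sanity-check the endpoint outside the server. A minimal sketch assuming Ollama's `/api/embeddings` endpoint and the default base URL; `build_embed_request` and `embed` are illustrative helpers, not part of the server:

```python
import json
import urllib.request

OLLAMA_BASE_URL = "http://localhost:11434"

def build_embed_request(model, text):
    """Payload shape for Ollama's /api/embeddings endpoint (single prompt)."""
    return {"model": model, "prompt": text}

def embed(text, model="nomic-embed-text"):
    """POST to a running Ollama instance and return the embedding vector."""
    req = urllib.request.Request(
        f"{OLLAMA_BASE_URL}/api/embeddings",
        data=json.dumps(build_embed_request(model, text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]

# embed("hello world") returns a 768-dimensional vector with nomic-embed-text
```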

When using Docker, use host.docker.internal to access Ollama running on the host:

```sh
OLLAMA_BASE_URL=http://host.docker.internal:11434
```

Or in docker-compose.yml:

```yaml
environment:
  - OLLAMA_BASE_URL=http://host.docker.internal:11434
```
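The host-vs-container distinction above can also be handled in code. A sketch of the idea only: the `/.dockerenv` check is a common heuristic for detecting a container, and `ollama_base_url` is a hypothetical helper, not part of the server:

```python
import os

def ollama_base_url(in_docker=None):
    """Resolve the Ollama base URL: an explicit OLLAMA_BASE_URL wins;
    inside Docker, reach the host via host.docker.internal; else localhost."""
    url = os.environ.get("OLLAMA_BASE_URL")
    if url:
        return url
    if in_docker is None:
        in_docker = os.path.exists("/.dockerenv")  # common container heuristic
    host = "host.docker.internal" if in_docker else "localhost"
    return f"http://{host}:11434"
```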
Advantages:

  • Free: No API costs, unlimited usage
  • Private: All data stays on your machine
  • Offline: Works without internet connection
  • Fast: Direct local API calls
  • No Rate Limits: Process as much as you want

Trade-offs:

  • ⚠️ Requires Ollama installation and model download (~270MB)
  • ⚠️ Uses local CPU/GPU resources
  • ⚠️ Slightly lower accuracy than cloud models

The nomic-embed-text model:

  • Dimensions: 768
  • Size: ~270MB
  • Performance: Excellent for English, good for other languages
  • Speed: Very fast on modern CPUs
  • License: Apache 2.0 (fully open-source)
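Whichever provider generates the vectors, semantic search then ranks stored document chunks by similarity to the query vector, most commonly cosine similarity. A generic sketch (toy 3-dimensional vectors stand in for real 768-dimensional ones):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Identical directions score 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # → 1.0
```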

OpenAI

Recommended for: production deployments, multilingual content, maximum accuracy

  1. Get an API key from the OpenAI Platform

  2. Add credits to your account

  3. Configure in .env:

     ```sh
     EMBEDDING_PROVIDER=openai
     OPENAI_API_KEY=sk-...your-key...
     OPENAI_EMBEDDING_MODEL=text-embedding-3-small
     ```
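With the official openai Python SDK (v1+), embeddings come from `client.embeddings.create()`. The sketch below only builds the call's arguments, so it runs without a key; `embedding_kwargs` is a hypothetical helper, and the optional `dimensions` parameter (accepted by the v3 embedding models) is shown as an assumption you can omit:

```python
def embedding_kwargs(text, model="text-embedding-3-small", dimensions=None):
    """Build arguments for client.embeddings.create(); passing `dimensions`
    asks the v3 models for a shortened vector (omit it for the default size)."""
    kwargs = {"model": model, "input": text}
    if dimensions is not None:
        kwargs["dimensions"] = dimensions
    return kwargs

# Usage, with OPENAI_API_KEY set in the environment:
#   from openai import OpenAI
#   client = OpenAI()
#   resp = client.embeddings.create(**embedding_kwargs("hello world"))
#   vector = resp.data[0].embedding  # 1536 floats for text-embedding-3-small
```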

text-embedding-3-small

```sh
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
```
  • Dimensions: 1536
  • Cost: ~$0.00002/1K tokens
  • Performance: High accuracy
  • Best for: Most use cases, cost-sensitive deployments

text-embedding-3-large

```sh
OPENAI_EMBEDDING_MODEL=text-embedding-3-large
```
  • Dimensions: 3072
  • Cost: ~$0.00013/1K tokens
  • Performance: Maximum accuracy
  • Best for: Multilingual content, maximum precision

Advantages:

  • High Accuracy: State-of-the-art embeddings
  • Multilingual: Excellent support for 20+ languages
  • No Local Resources: Runs entirely in the cloud
  • Low Latency: Fast API responses
  • Scalable: No local hardware limits

Trade-offs:

  • ⚠️ Requires API key and credits
  • ⚠️ Data sent to OpenAI servers
  • ⚠️ Requires internet connection
  • ⚠️ Rate limits apply (though very generous)

Typical documentation indexing costs:

| Documentation Size | Tokens (approx.) | Cost (text-embedding-3-small) |
| --- | --- | --- |
| 100 pages | ~250K tokens | ~$0.005 |
| 500 pages | ~1.25M tokens | ~$0.025 |
| 1000 pages | ~2.5M tokens | ~$0.05 |

Search costs are negligible (~$0.00001 per query).
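The table's figures follow directly from the per-token rate. A quick sanity check, assuming roughly 2,500 tokens per page (the ratio implied by the table above):

```python
RATE_SMALL = 0.00002 / 1000  # ~$0.00002 per 1K tokens, as a per-token rate
TOKENS_PER_PAGE = 2500       # assumption implied by the table above

def indexing_cost(pages, rate=RATE_SMALL):
    """Estimated one-time embedding cost in USD for a documentation set."""
    return pages * TOKENS_PER_PAGE * rate

for pages in (100, 500, 1000):
    print(f"{pages} pages -> ${indexing_cost(pages):.3f}")
```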

If you set EMBEDDING_PROVIDER=openai but don’t provide a valid OPENAI_API_KEY, the server will:

  1. ⚠️ Log a warning
  2. 🔄 Automatically fall back to Ollama (if configured)
  3. ❌ Fail to start if neither provider is available

This ensures the server can always start with a working configuration.
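That startup behavior can be summarized as a small selection function. A sketch of the documented order only; the names and structure are illustrative, not the server's internals:

```python
def select_provider(preferred, openai_key, ollama_configured):
    """Mirror the documented startup logic: OpenAI requires a valid key;
    otherwise fall back to Ollama; fail if neither provider is usable."""
    if preferred == "openai" and openai_key:
        return "openai"
    if ollama_configured:
        return "ollama"  # logged as a warning + automatic fallback
    raise RuntimeError("no embedding provider available; refusing to start")

print(select_provider("openai", openai_key=None, ollama_configured=True))  # → ollama
```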

Real-world performance on a typical documentation set (500 pages):

| Provider | Indexing Time | Search Time | Accuracy |
| --- | --- | --- | --- |
| Ollama (nomic-embed-text) | ~5 min | ~50ms | Good ⭐⭐⭐⭐ |
| OpenAI (text-embedding-3-small) | ~2 min | ~100ms | Excellent ⭐⭐⭐⭐⭐ |
| OpenAI (text-embedding-3-large) | ~2 min | ~100ms | Best ⭐⭐⭐⭐⭐ |

Times measured on: M1 MacBook Pro (Ollama), standard network connection (OpenAI)

```sh
EMBEDDING_PROVIDER=ollama
```

Fast, free, and private. Perfect for testing and iteration.

```sh
EMBEDDING_PROVIDER=openai
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
```

Great accuracy at minimal cost.

```sh
EMBEDDING_PROVIDER=openai
OPENAI_EMBEDDING_MODEL=text-embedding-3-large
```

Maximum accuracy across all languages.

```sh
EMBEDDING_PROVIDER=ollama
```

Keep all data on-premises.