Gemini (Free Cloud LLM + Embeddings)

Use Google Gemini as a drop-in replacement for local Ollama. No GPU required, no Ollama process to manage. Works with the free tier.

When to use this

  • You don't have a GPU (or it's busy with other work)
  • You want zero local compute overhead
  • The privacy tradeoff is acceptable: raw text is sent to Google for both embedding and distillation

Prerequisites

  • A Gemini API key (free to create in Google AI Studio)
Setup

Set these environment variables:
EMBEDDING_PROVIDER=gemini
DISTILLER_PROVIDER=gemini
GEMINI_API_KEY=your-key-here

Storage stays local (SQLite + LanceDB). Ollama is no longer needed.
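Before wiring the key into the app, you can sanity-check it against the Gemini REST API directly. A minimal sketch, assuming the public v1beta embedContent endpoint; the request shape follows the Gemini API docs, so adjust if the API version changes:

```python
import json
import os
import urllib.request

# Sanity check: embed one string via the Gemini REST API.
# Endpoint and payload shape per the public v1beta embedContent API.
API_BASE = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "text-embedding-004"

def build_embed_request(text: str, api_key: str) -> urllib.request.Request:
    url = f"{API_BASE}/models/{MODEL}:embedContent?key={api_key}"
    payload = {
        "model": f"models/{MODEL}",
        "content": {"parts": [{"text": text}]},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

if __name__ == "__main__":
    key = os.environ.get("GEMINI_API_KEY")
    if key:
        with urllib.request.urlopen(build_embed_request("hello", key)) as resp:
            vec = json.load(resp)["embedding"]["values"]
            print(f"ok: {len(vec)}-dim embedding")  # 768 expected for this model
    else:
        print("set GEMINI_API_KEY first")
```

If the key is valid you should get a 768-dimension vector back; a 400/403 response means the key or endpoint is wrong.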

What changes

Component      Ollama (default)             Gemini
Embeddings     nomic-embed-text (local)     text-embedding-004 (768-dim, Google API)
Distillation   gemma3:4b (local)            gemini-2.0-flash (Google API)
Storage        unchanged                    unchanged
Privacy        Raw text stays on device     Raw text goes to Google
Cost           $0 (needs Ollama running)    $0 (free tier)

Optional: customize models

EMBEDDING_MODEL=text-embedding-004    # default for gemini provider
LLM_MODEL=gemini-2.0-flash           # default for gemini provider

EMBEDDING_MODEL and LLM_MODEL are universal — they apply to whichever provider is active. Each provider has sensible defaults if you omit them.
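The resolution order can be sketched as: the universal variable wins if set, otherwise the active provider's default applies. A hypothetical illustration — the DEFAULTS table below just restates the defaults documented on this page, and the function name is made up:

```python
import os

# Per-provider defaults as documented above; the dict itself is illustrative.
DEFAULTS = {
    "ollama": {"embedding": "nomic-embed-text", "llm": "gemma3:4b"},
    "gemini": {"embedding": "text-embedding-004", "llm": "gemini-2.0-flash"},
}

def resolve_model(kind: str, provider: str) -> str:
    """Universal env var wins; otherwise fall back to the provider default."""
    env_var = {"embedding": "EMBEDDING_MODEL", "llm": "LLM_MODEL"}[kind]
    return os.environ.get(env_var) or DEFAULTS[provider][kind]
```

So with LLM_MODEL unset and DISTILLER_PROVIDER=gemini, distillation uses gemini-2.0-flash; setting LLM_MODEL overrides it for whichever provider is active.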

Mix and match

Embedding and distillation providers are independent. You can use Gemini for one and Ollama for the other:

# Gemini embeddings + local Ollama distillation (privacy-preserving)
EMBEDDING_PROVIDER=gemini
DISTILLER_PROVIDER=ollama
GEMINI_API_KEY=your-key-here

Or combine with PostgreSQL storage:

BACKEND=postgres
DATABASE_URL=postgresql://...
EMBEDDING_PROVIDER=gemini
DISTILLER_PROVIDER=gemini
GEMINI_API_KEY=your-key-here

Free tier limits

As of March 2026, the Gemini free tier provides:

  • gemini-2.0-flash: 15 requests/minute, 1M tokens/day
  • text-embedding-004: 1500 requests/minute

For typical team memory usage (a few remember + search calls per minute), the free tier is more than sufficient.

Switching back to Ollama

Set providers back to ollama (or remove EMBEDDING_PROVIDER / DISTILLER_PROVIDER — Ollama is the default).