Troubleshooting & AI providers

Most problems are one of two things: no AI provider configured, or the wrong model name. This page covers each provider — Claude Code, Ollama, Gemini, OpenAI — how to set one up (including a free Gemini key), which model to use, and the errors you're most likely to see.

How providers are selected

The app answers with one provider at a time, chosen by RAG_PROVIDER (claude · ollama · gemini · openai). The default is claude — no API key. Set values in a .env file at the repo root (copy .env.example first), or as environment variables, then restart the app. In the web UI you can also pick a provider from the dropdown for a single question.

cp .env.example .env        # then edit .env
RAG_PROVIDER=ollama         # claude | ollama | gemini | openai

Polyglot note: the Node.js and C# ports of Lesson 1 support claude and ollama. For gemini / openai, use the Python reference (./run -l 1).

Juggling these providers and keys across several projects or machines? agentvault manages Claude / Ollama / Gemini / OpenAI configs and rules in one place.

Ollama (fully local, no key)

Runs models on your own machine. Install it from ollama.com, then pull a model. (Prefer one command to pull + run any Ollama model? ai-runner wraps exactly that.)

Error: `404 (Not Found)` / “model … not found”

Ollama is running, but the model you asked for isn't installed. The default is llama3.1:8b. Either pull it, or point the app at a model you already have:

ollama list                 # what you already have
ollama pull llama3.1:8b     # download the default model (~4.9 GB)

# ...or use an installed model without pulling:
#   macOS / Linux:
export OLLAMA_MODEL=qwen2.5:7b
#   Windows PowerShell:
$env:OLLAMA_MODEL = "qwen2.5:7b"

To make it permanent, set OLLAMA_MODEL in .env.

Error: connection refused / “Ollama unavailable”

The Ollama server isn't running or is on a different address. Start it and check the URL:

ollama serve                            # start the server
curl http://localhost:11434/api/tags    # should list your models
# non-default host/port? set it:
export OLLAMA_URL=http://localhost:11434

Which model?

Use	Model	Pull
Chat / Q&A (recommended)	`llama3.1:8b`	`ollama pull llama3.1:8b`
Smaller / faster	`qwen2.5:7b`, `llama3.2:1b`	`ollama pull qwen2.5:7b`
Tool / function calling (Lesson 9)	`llama3.1`, `qwen2.5`, `mistral-nemo`	`ollama pull llama3.1`
Embeddings (`RAG_RETRIEVER=embeddings`)	`nomic-embed-text`	`ollama pull nomic-embed-text`

Claude Code (default, no API key)

Uses your existing Claude Code CLI login, so there's nothing to configure.

Error: “Claude Code CLI 'claude' not found on PATH”

Install the CLI and sign in once:

npm install -g @anthropic-ai/claude-code
claude                 # first run signs you in
claude --version

Custom install path? Point the app at it with CLAUDE_BIN=/full/path/to/claude. No API key is needed — it reuses your Claude login. Claude Code can't produce embeddings, so for RAG_RETRIEVER=embeddings use Ollama, Gemini, or OpenAI as the embed provider.

Google Gemini (free tier)

Google's hosted models, with a generous free tier — a good no-cost cloud option.

Open a free account & get an API key

Go to aistudio.google.com/app/apikey (Google AI Studio).
Sign in with any Google account — the free tier needs no billing / no credit card.
Click Create API key (a new Google Cloud project is created for you), then copy the key.
Put it in .env and select Gemini:

RAG_PROVIDER=gemini
GEMINI_API_KEY=AIza...your-key...
GEMINI_MODEL=gemini-2.5-flash        # fast & free-tier friendly (default)
GEMINI_EMBED_MODEL=text-embedding-004

Which model? gemini-2.5-flash is the default — fast and free-tier friendly. For higher quality use gemini-2.5-pro. Embeddings use text-embedding-004.

Common errors

You see	Cause & fix
`400 API key not valid` / `403`	Wrong or missing `GEMINI_API_KEY`. Re-copy it from AI Studio (no quotes, no spaces).
`404 model not found`	`GEMINI_MODEL` name is wrong or unavailable to your key. Use `gemini-2.5-flash`.
`429` / quota exceeded	Free-tier rate limit hit — wait a minute and retry, or slow down requests.

OpenAI (or any OpenAI-compatible endpoint)

Hosted models from OpenAI. Note the API is paid (it needs billing set up), separate from a ChatGPT subscription.

Get an API key

Go to platform.openai.com/api-keys and sign in.
Add a payment method under Billing (required before the API will answer).
Create a secret key, copy it, and add it to .env:

RAG_PROVIDER=openai
OPENAI_API_KEY=sk-...your-key...
OPENAI_MODEL=gpt-4o-mini             # cheap & capable default
OPENAI_EMBED_MODEL=text-embedding-3-small
# Local/compatible server (LM Studio, vLLM, etc.)? override the base URL:
OPENAI_BASE_URL=https://api.openai.com/v1

Common errors

You see	Cause & fix
`401 Incorrect API key`	Wrong/expired `OPENAI_API_KEY`. Generate a new key.
`429 insufficient_quota`	No billing / credit on the account. Add a payment method.
`404` on a custom endpoint	`OPENAI_BASE_URL` is wrong, or that server doesn't expose the requested model.

Other common issues

Symptom	Fix
Answer says “not covered in your documents”	That's correct behaviour when the answer isn't in your files — it's the anti-hallucination guard. Add relevant documents to `documents/`.
No sources / “nothing relevant found”	The corpus is empty or the question doesn't match. Drop files into `documents/` (PDF/DOCX/TXT/MD); the index auto-refreshes.
Embeddings mode falls back to BM25	The embed provider was unreachable. Set `RAG_EMBED_PROVIDER` to ollama/gemini/openai and ensure it's configured.
Web UI won't start / port in use	`./run` auto-picks a free port. To force one: `WEB_PORT=5050 ./run -l 1`.
Changed `.env` but nothing changed	Restart the app — config is read at startup.

Environment variables (quick reference)

Variable	Default	What it does
`RAG_PROVIDER`	`claude`	Which AI answers: claude · ollama · gemini · openai
`RAG_RETRIEVER`	`bm25`	Retrieval: bm25 (keyword) · embeddings (semantic)
`OLLAMA_URL`	`http://localhost:11434`	Ollama server address
`OLLAMA_MODEL`	`llama3.1:8b`	Ollama chat model (must be pulled)
`GEMINI_API_KEY`	—	Gemini key (from AI Studio)
`GEMINI_MODEL`	`gemini-2.5-flash`	Gemini chat model
`OPENAI_API_KEY`	—	OpenAI key (needs billing)
`OPENAI_MODEL`	`gpt-4o-mini`	OpenAI chat model
`CLAUDE_BIN`	`claude`	Path to the Claude Code CLI

Back to Lesson 1 → Install guide (PDF) GitHub

Troubleshooting & AI providers

How providers are selected

Ollama (fully local, no key)

Error: 404 (Not Found) / “model … not found”

Error: connection refused / “Ollama unavailable”

Which model?

Claude Code (default, no API key)

Error: “Claude Code CLI 'claude' not found on PATH”

Google Gemini (free tier)

Open a free account & get an API key

Common errors

OpenAI (or any OpenAI-compatible endpoint)

Get an API key

Common errors

Other common issues

Environment variables (quick reference)

Error: `404 (Not Found)` / “model … not found”