Lesson 2

Build an MCP server

Expose the document search from Lesson 1 as a Model Context Protocol tool, so Claude Code can query your documents/ folder natively, with no copy-paste. Build it in Python, Node.js, or C#, each on the official MCP SDK.

Overview

What you'll build & the big flip

In Lesson 1, your app drove the pipeline and called the LLM. In Lesson 2 the relationship flips: the LLM drives, and your retriever becomes a tool it reaches for on demand. Same engine, new integration surface.

   ┌──────────────┐   list / call tools       ┌──────────────────────────┐
   │ Claude Code  │ ───────────────────────▶  │ local-ai-lab MCP server  │
   │ (MCP host)   │ ◀───────────────────────  │   search_docs,           │
   └──────┬───────┘   results (cited)          │   list_documents         │
          │                                    └─────────────┬────────────┘
          │  "how do I reset the device?"                    │ reuses Lesson 1
          ▼                                                  ▼
   grounded answer with citations                    documents/ + retriever

Why it matters: RAG you can call from any MCP-aware client, with no bespoke UI. Your documents become a first-class capability of the assistant itself.

Concept

What is MCP?

The Model Context Protocol is an open standard that lets an AI client discover and call external tools over JSON-RPC. A server advertises tools (each with a name, description, and input schema); the host (Claude Code) lists them, and the model calls them when useful, feeding results back into its answer.

Three core ideas: tools (functions the model can call), the transport (we use stdio: the host launches your server as a subprocess and talks over stdin/stdout), and the handshake (initialize → list tools → call tools). The official SDK handles all the protocol plumbing, so you just write functions. Python's FastMCP turns decorated functions into tools; Node's McpServer.registerTool turns a handler plus a Zod schema into a tool; C#'s [McpServerTool] attribute turns a method into a tool.
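To make the handshake concrete, here is a rough sketch of the JSON-RPC 2.0 messages a client sends for initialize → list tools → call tool. The SDK builds these for you; the protocol revision string and the client name below are illustrative assumptions, not values taken from this lesson.

```python
import json

def request(msg_id, method, params=None):
    """Build a JSON-RPC 2.0 request as a plain dict."""
    msg = {"jsonrpc": "2.0", "id": msg_id, "method": method}
    if params is not None:
        msg["params"] = params
    return msg

# 1. initialize: client and server exchange versions and capabilities.
initialize = request(1, "initialize", {
    "protocolVersion": "2025-06-18",   # assumed revision; the SDKs negotiate this
    "capabilities": {},
    "clientInfo": {"name": "sketch-client", "version": "0.0.0"},  # hypothetical
})

# A notification (no "id") tells the server the client is ready.
initialized = {"jsonrpc": "2.0", "method": "notifications/initialized"}

# 2. list tools: ask the server what it offers.
list_tools = request(2, "tools/list")

# 3. call a tool by name with JSON arguments.
call_tool = request(3, "tools/call", {
    "name": "search_docs",
    "arguments": {"query": "how do I reset the device", "k": 3},
})

handshake = [initialize, initialized, list_tools, call_tool]
for message in handshake:
    print(json.dumps(message))
```

Each printed line is one message as it would travel over the transport; responses come back carrying the same id as the request they answer.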

Setup

Prerequisites

Finish Lesson 1 first — the MCP server is a thin wrapper over the retriever you built there. Then add the official MCP SDK for your language:

Type this

pip install mcp        # the official Model Context Protocol Python SDK

./run installs this for you on first use

cd node/lesson-2 && npm install    # @modelcontextprotocol/sdk + zod

./run restores this for you on first use

cd dotnet/lesson-2 && dotnet restore    # ModelContextProtocol 1.4.0

Every SDK ships both a server API and an stdio client we'll use to test. You also need the Claude Code CLI to register the server at the end.

Install

Dependencies — Linux · macOS · Windows

Lesson 2 builds on Lesson 1's environment and adds the official MCP SDK for your language. No Docker. The repo-root ./run -l 2 --lang … dispatcher sets everything up on first use.

Linux / macOS

python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt          # includes mcp

Windows (PowerShell)

python -m venv venv; venv\Scripts\Activate.ps1
pip install -r requirements.txt

You only need Node.js 18+

node --version                  # confirm 18+
./run -l 2 --lang node test     # installs the SDK on first use, then drives the server

You only need the .NET 8 SDK

dotnet --version                # confirm 8.x
./run -l 2 --lang csharp test   # restores + builds on first use, then drives the server

Claude Code CLI (to register the server)

npm install -g @anthropic-ai/claude-code && claude

MCP is multi-language — there are official SDKs for Python, Node.js and C#, and this lesson ships a working server in all three. Pick a language above; full per-OS setup is in INSTALL (PDF).

Step 1

The server skeleton

Create mcp_server.py. FastMCP gives you a server object; tools are just decorated functions. Their docstring becomes the description the model sees, and the type hints become the input schema — so write them for the model to read.

mcp_server.py

from mcp.server.fastmcp import FastMCP

from localrag.config import load_config
from localrag.engine import get_retriever
from localrag.extract import discover_files

mcp = FastMCP("local-ai-lab-docs")

# ... tools go here ...

def main():
    mcp.run()        # default transport is stdio

if __name__ == "__main__":
    main()

node/lesson-2/src/server.js

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";          // input schemas for registerTool (Step 2)
import path from "node:path";     // path.basename in list_documents (Step 3)

// Reuse the Lesson 1 engine, module-for-module.
import { loadConfig } from "../../lesson-1/src/config.js";
import { getRetriever } from "../../lesson-1/src/engine.js";
import { discoverFiles } from "../../lesson-1/src/extract.js";

const server = new McpServer({ name: "local-ai-lab-docs-node", version: "1.0.0" });

// ... tools go here ...

const transport = new StdioServerTransport();   // stdio: the host talks to us over stdin/stdout
await server.connect(transport);

dotnet/lesson-2/Program.cs

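using System.ComponentModel;                      // [Description] on the tools in Step 2
using Microsoft.Extensions.DependencyInjection;   // AddMcpServer()
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;               // AddConsole, LogLevel
using ModelContextProtocol.Server;
using LocalRag;   // reuse Lesson 1's Config / Store / Retriever / Extract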

// stdout is the JSON-RPC stream; send stray Console output to stderr instead.
Console.SetOut(Console.Error);
// Build WITHOUT the CLI args so the positional action token can't break startup.
var builder = Host.CreateApplicationBuilder();
// Route logs to stderr so they never corrupt the JSON-RPC stream on stdout.
builder.Logging.AddConsole(o => o.LogToStandardErrorThreshold = LogLevel.Trace);
builder.Services
    .AddMcpServer()
    .WithStdioServerTransport()   // serve over stdio (stdin/stdout)
    .WithToolsFromAssembly();
await builder.Build().RunAsync();

Notice the imports: the config, retriever, and file-discovery helpers all come straight from Lesson 1. In Node they are imported across from node/lesson-1; in C#, Lesson 1's .cs files are compiled into this project. MCP is a new doorway onto the same engine.
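Step 1 says the docstring becomes the description and the type hints become the input schema. The sketch below shows one way that mapping could work using only the standard library. It is an illustration of the idea, not FastMCP's actual implementation; describe_tool and JSON_TYPES are hypothetical names.

```python
import inspect
from typing import get_type_hints

# Map a few Python annotations to JSON Schema type names (deliberately incomplete).
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def describe_tool(func):
    """Derive a tool description from a function's docstring and type hints."""
    hints = get_type_hints(func)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown annotations fall back to "string" in this sketch.
        properties[name] = {"type": JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)   # no default means the caller must supply it
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputSchema": {"type": "object",
                        "properties": properties,
                        "required": required},
    }

def search_docs(query: str, k: int = 5) -> str:
    """Search the user's local documents and return the most relevant passages."""
    return ""   # body omitted; only the signature and docstring matter here

tool = describe_tool(search_docs)
print(tool)
```

Because k has a default, only query ends up in the required list, which is why a model can call search_docs with just a query.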

Step 2

The `search_docs` tool

The star of the show. It runs the Lesson 1 retriever and returns passages tagged [source:page] so the model can cite them.

mcp_server.py — add the tool

@mcp.tool()
def search_docs(query: str, k: int = 5) -> str:
    """Search the user's local documents and return the most relevant passages.

    Each passage is prefixed with its source as [filename:page] so the model
    can cite it. Call this to ground answers in the user's own files instead
    of relying on training data.
    """
    config = load_config()
    hits = get_retriever(config).search(query, max(1, int(k)))
    if not hits:
        return "No relevant passages found in the local documents."
    return "\n\n".join(
        f"[{h['source']}:{h['page_number']}] {h['text']}" for h in hits)

server.js — add the tool

server.registerTool("search_docs", {
  // This description is a prompt — the model reads it to decide when to call.
  description:
    "Search the user's local documents and return the most relevant passages. " +
    "Each passage is prefixed with its source as [filename:page] so the model " +
    "can cite it. Call this to ground answers in the user's own files instead " +
    "of relying on training data.",
  inputSchema: z.object({ query: z.string(), k: z.number().int().optional() }),
}, async ({ query, k }) => {
  const config = loadConfig();
  const hits = (await getRetriever(config)).search(query, Math.max(1, k ?? 5));
  const text = hits.length
    ? hits.map((h) => `[${h.source}:${h.page_number}] ${h.text}`).join("\n\n")
    : "No relevant passages found in the local documents.";
  return { content: [{ type: "text", text }] };
});

Program.cs — add the tool

[McpServerToolType]
public static class DocTools
{
    // The [Description] attributes ARE the prompt the model reads.
    [McpServerTool(Name = "search_docs"), Description(
        "Search the user's local documents and return the most relevant passages. " +
        "Each passage is prefixed with its source as [filename:page] so the model " +
        "can cite it. Call this to ground answers in the user's own files instead " +
        "of relying on training data.")]
    public static string SearchDocs(string query, int k = 5)
    {
        var config = Config.Load();
        var hits = GetRetriever(config).Search(query, Math.Max(1, k));
        if (hits.Count == 0)
            return "No relevant passages found in the local documents.";
        return string.Join("\n\n",
            hits.Select(h => $"[{h.Source}:{h.PageNumber}] {h.Text}"));
    }
}

The description is a prompt. Whether it's a docstring, a description string, or a [Description] attribute, the model reads it to decide when to call the tool, so it explicitly says "to ground answers… instead of relying on training data." Good tool descriptions are as important as good code.
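The [source:page] prefix is what makes the output citable, so it helps to see the format round-trip. This sketch formats hits the same way search_docs does and then pulls the tags back out with a regular expression. The sample hits and the extract_citations helper are hypothetical, added only for illustration.

```python
import re

def format_hits(hits):
    """Join hits into the [source:page] text format that search_docs returns."""
    if not hits:
        return "No relevant passages found in the local documents."
    return "\n\n".join(
        f"[{h['source']}:{h['page_number']}] {h['text']}" for h in hits)

# Matches a tag such as [sample_manual.md:1] at the start of a passage.
CITATION = re.compile(r"^\[(?P<source>[^\[\]]+):(?P<page>\d+)\]", re.MULTILINE)

def extract_citations(text):
    """Return (source, page) pairs for every tagged passage in the text."""
    return [(m["source"], int(m["page"])) for m in CITATION.finditer(text)]

# Hypothetical hits, shaped like the dicts the tool code reads.
hits = [
    {"source": "sample_manual.md", "page_number": 1,
     "text": "Hold the power button for 10 seconds."},
    {"source": "faq.md", "page_number": 2,
     "text": "The LED blinks blue three times."},
]

text = format_hits(hits)
citations = extract_citations(text)
print(text)
print(citations)
```

A check like this is handy in tests: if a passage ever comes back without a tag, the citation list is shorter than the hit list.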

Step 3

A second tool: `list_documents`

Servers usually expose more than one tool. This one lets the model see what's in the corpus before searching — handy for "what do you have on X?" questions.

mcp_server.py — add another tool

@mcp.tool()
def list_documents() -> str:
    """List the documents currently available to search in the local corpus."""
    config = load_config()
    names = [p.name for p in discover_files(config.docs_dir)]
    return "\n".join(names) if names else "(no documents indexed yet)"

server.js — add another tool

server.registerTool("list_documents", {
  description: "List the documents currently available to search in the local corpus.",
  inputSchema: z.object({}),
}, async () => {
  const config = loadConfig();
  const names = discoverFiles(config.docsDir).map((p) => path.basename(p));
  const text = names.length ? names.join("\n") : "(no documents indexed yet)";
  return { content: [{ type: "text", text }] };
});

Program.cs — add another tool (inside DocTools)

[McpServerTool(Name = "list_documents"), Description(
    "List the documents currently available to search in the local corpus.")]
public static string ListDocuments()
{
    var config = Config.Load();
    var names = Extract.DiscoverFiles(config.DocsDir).Select(Path.GetFileName).ToList();
    return names.Count > 0 ? string.Join("\n", names) : "(no documents indexed yet)";
}

Tool design tip: keep each tool small and single-purpose with a clear name. The model composes them — it might call list_documents, then search_docs — just like you'd chain functions.
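As a toy illustration of that composition, the sketch below registers two single-purpose functions in a dispatch table and chains them the way a model might. The in-memory CORPUS and the keyword-overlap scoring are stand-ins assumed for this example; they are not the Lesson 1 retriever.

```python
# Hypothetical in-memory corpus standing in for documents/.
CORPUS = {
    "sample_manual.md": "Hold the power button for 10 seconds to reset the device.",
    "warranty.md": "The warranty covers two years from purchase.",
}

def list_documents() -> str:
    """List the documents available to search."""
    return "\n".join(sorted(CORPUS)) if CORPUS else "(no documents indexed yet)"

def search_docs(query: str, k: int = 5) -> str:
    """Rank documents by naive keyword overlap (a stand-in for real retrieval)."""
    terms = set(query.lower().split())
    scored = []
    for name, text in CORPUS.items():
        words = set(text.lower().replace(".", "").split())
        scored.append((len(terms & words), name, text))
    ranked = sorted(scored, reverse=True)
    hits = [(n, t) for score, n, t in ranked if score > 0][:max(1, k)]
    if not hits:
        return "No relevant passages found in the local documents."
    return "\n\n".join(f"[{n}:1] {t}" for n, t in hits)   # page fixed at 1 here

# One small function per tool, looked up by name.
TOOLS = {"list_documents": list_documents, "search_docs": search_docs}

def call_tool(name, arguments=None):
    """Dispatch a tool call by name, as a host would on the model's behalf."""
    return TOOLS[name](**(arguments or {}))

inventory = call_tool("list_documents")
answer = call_tool("search_docs", {"query": "reset the device", "k": 1})
print(inventory)
print(answer)
```

Keeping each tool this narrow means the caller, not the tool, decides how to combine them.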

Step 4

Run it over stdio

The server defaults to the stdio transport: it reads JSON-RPC from stdin and writes to stdout. That's exactly how a host like Claude Code launches a local server — as a subprocess it talks to over pipes.

Type this

python mcp_server.py          # waits silently for an MCP client to connect

Type this

./run -l 2 --lang node serve  # waits silently for an MCP client to connect

Type this

./run -l 2 --lang csharp serve  # waits silently for an MCP client to connect

It looks like it's hanging — that's correct. The server is waiting for a client to speak the protocol over stdin. You won't talk to it by hand; the next step uses a client to drive it.
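If you want to see what "talking over pipes" looks like, here is a minimal sketch using only the standard library. The child process is a toy stand-in for a server, not the real SDK: it reads one JSON message per line from stdin, writes its reply to stdout, and sends log output to stderr so the protocol stream stays clean.

```python
import json
import subprocess
import sys

# Toy child process (an assumption for this sketch, not an MCP server).
CHILD = r'''
import json, sys
for line in sys.stdin:
    req = json.loads(line)
    print("log: got " + req["method"], file=sys.stderr)    # logs go to stderr
    resp = {"jsonrpc": "2.0", "id": req["id"],
            "result": {"echo": req["method"]}}
    sys.stdout.write(json.dumps(resp) + "\n")              # protocol goes to stdout
    sys.stdout.flush()
'''

# Launch the child the way a host would: as a subprocess with piped stdio.
proc = subprocess.Popen(
    [sys.executable, "-c", CHILD],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE,
    stderr=subprocess.PIPE, text=True)

# Send one newline-delimited JSON-RPC request and read one reply.
proc.stdin.write(json.dumps({"jsonrpc": "2.0", "id": 1, "method": "tools/list"}) + "\n")
proc.stdin.flush()
response = json.loads(proc.stdout.readline())

proc.stdin.close()          # EOF on stdin ends the child's read loop
logs = proc.stderr.read()
proc.wait()
print(response)
```

This is also why a stray print to stdout can break a real server: the host would try to parse it as a protocol message.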

Step 5

Test it with an stdio client

The SDK includes a client. This spawns the server, does the handshake, lists tools, and calls search_docs — a real integration test, no LLM needed.

tests/test_mcp.py (essence)

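import sys
from pathlib import Path

from mcp import ClientSession
from mcp.client.stdio import StdioServerParameters, stdio_client

# Assumption: tests/ sits directly under the folder that holds mcp_server.py.
ROOT = Path(__file__).resolve().parent.parent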

async def run():
    params = StdioServerParameters(command=sys.executable,
                                   args=["mcp_server.py"], cwd=str(ROOT))
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            result = await session.call_tool(
                "search_docs", {"query": "how do I reset the device", "k": 3})
            text = "".join(c.text for c in result.content)
            return [t.name for t in tools.tools], text

Run it

pytest -q tests/test_mcp.py     # passes: tools listed, passage returned

node/lesson-2/src/demo.js (essence)

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

const transport = new StdioClientTransport({
  command: process.execPath, args: ["src/server.js"],
});
const client = new Client({ name: "demo", version: "1.0.0" });
await client.connect(transport);                       // initialize
const tools = await client.listTools();                // list
const res = await client.callTool({ name: "search_docs",
  arguments: { query: "how do I reset the device", k: 3 } });   // call

Run it

./run -l 2 --lang node test     # spawns the server, lists tools, calls search_docs

Program.cs — the demo client (essence)

using ModelContextProtocol.Client;

var transport = new StdioClientTransport(new StdioClientTransportOptions {
    Command = "dotnet", Arguments = [dll, "serve"],
});
await using var client = await McpClient.CreateAsync(transport); // initialize (async-disposed)
var tools  = await client.ListToolsAsync();            // list
var res    = await client.CallToolAsync("search_docs", // call
    new Dictionary<string, object?> { ["query"] = "how do I reset the device", ["k"] = 3 });
var text = res.Content.OfType<TextContentBlock>().First().Text;

Run it

./run -l 2 --lang csharp test   # spawns the server, lists tools, calls search_docs

The handshake in code: initialize → list tools → call tool. Those three steps are all a tool call needs, and they are the same in every language. The result contains "power button" and sample_manual.md, so the answer is grounded, cited, and verified with no LLM in the loop.

Step 6

Register it with Claude Code

Now hand the server to a real host. From the repo directory, register it with one command:

Type this

claude mcp add local-ai-lab-docs -- python mcp_server.py
claude mcp list                         # confirm it's registered

Type this

claude mcp add local-ai-lab-docs-node -- node node/lesson-2/src/server.js
claude mcp list                         # confirm it's registered

Type this

claude mcp add local-ai-lab-docs-dotnet -- dotnet dotnet/lesson-2/bin/Release/net8.0/LocalRagMcp.dll serve
claude mcp list                         # confirm it's registered

That tells Claude Code: "when you start, launch this server and treat its tools as your own." Prefer absolute paths if you run Claude Code from elsewhere. Each language registers under a distinct name, so the three servers can coexist. The command is equivalent to adding the server to your MCP config JSON by hand.
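For a sense of what the hand-written config might contain, this sketch builds an entry with absolute paths. The mcpServers layout with command and args is an assumption based on the common MCP client config shape; check the Claude Code documentation for the exact file name and schema before relying on it. The helper name mcp_config_entry is hypothetical.

```python
import json
import sys
from pathlib import Path

def mcp_config_entry(name, script, python=sys.executable):
    """Build an mcpServers entry that uses absolute paths (assumed layout)."""
    return {
        "mcpServers": {
            name: {
                "command": str(Path(python)),           # interpreter to launch
                "args": [str(Path(script).resolve())],  # absolute path to the server
            }
        }
    }

config = mcp_config_entry("local-ai-lab-docs", "mcp_server.py")
print(json.dumps(config, indent=2))
```

Resolving the script path up front is what lets the host launch the server no matter which directory it starts in.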

Step 7

See it work — RAG, native to the assistant

Open Claude Code in the repo and just ask. The model will call search_docs against your documents/ folder and answer with citations — no copy-paste, no custom UI.

In Claude Code chat

You:    How do I reset the device?

Claude: (calls search_docs "reset device")
        Hold the power button for 10 seconds until the LED blinks blue
        three times. [sample_manual.md:1]

That's the payoff: your local, private documents are now a tool the assistant uses on its own initiative. The same retriever, reachable from any MCP host.

Recap

You built a working MCP server

Piece	What it does
`FastMCP("…")` / `new McpServer(…)` / `AddMcpServer()`	the server; handles all protocol plumbing
`@mcp.tool()` / `registerTool(…)` / `[McpServerTool]`	turns a function into a callable tool (description = prompt, types = schema)
`search_docs` / `list_documents`	your tools, reusing the Lesson 1 engine
`mcp.run()` / `server.connect(stdio)` / `WithStdioServerTransport()`	serves over stdio for the host to launch
`claude mcp add`	registers it so Claude Code can call it

The through-line: search_docs is the same capability you'll rebuild in every later lesson — as an Ollama function call, a Semantic Kernel plugin, a Bedrock action group, and a Google ADK tool. Master the primitive once; the frameworks are just wrappers.
