# Context7-style Docs MCP System

A self-hosted, local-compatible documentation retrieval and search system using Docker. This project uses Qdrant for vector embeddings and SQLite for metadata storage, exposing a FastAPI docs backend and an MCP server for IDE/tool integration.

## 🏠 Home Server / Production Use

This section covers hardening recommendations for running this system on a home server or in production.

### Environment Variables (`.env`)

Copy `.env.example` to `.env` and configure:

```bash
cp .env.example .env
```

| Variable | Description | Example |
|----------|-------------|---------|
| `HOST_PORT` | Docs API host port (default: 8787) | `8787` |
| `MCP_HOST_PORT` | MCP server host port (default: 8788) | `8788` |
| `DOCS_API_KEY` | API key for docs-api authentication (optional) | `my-secret-key-123` |
| `MCP_API_KEY` | API key for MCP server authentication (optional; conceptually handled by FastMCP via a `--key` flag) | `mcp-secret-key` |
| `DOCS_PATH` | Path to documentation files inside container | `/docs` |
| `DB_PATH` | SQLite database path inside container | `/data/db.sqlite` |
| `LOG_LEVEL` | Logging level: DEBUG, INFO, WARNING, ERROR | `INFO` |

> **Security Note:** API keys are optional. Leave them empty in `.env` if you don't need authentication (backward compatible with existing setups). If `DOCS_API_KEY` is set, the docs-api requires a matching `X-API-Key` header on protected endpoints.

### Port Configuration

For firewall or network setup:

```bash
# Example: Run docs-api on port 9000 instead of 8787
HOST_PORT=9000 MCP_HOST_PORT=9001 docker compose up -d --build
```

### Backup Instructions

#### SQLite Database (`data/db.sqlite`)

Regular SQLite backups prevent data loss.
Example cron job:

```bash
# Add to crontab (runs daily at 2am).
# Replace /path/to/local-context7 with the project directory (placeholder).
# -T disables TTY allocation, which cron does not provide;
# % must be escaped as \% inside a crontab entry.
0 2 * * * cd /path/to/local-context7 && docker compose exec -T docs-api sqlite3 /data/db.sqlite ".backup '/backups/db_$(date +\%Y\%m\%d).sqlite'"
```

Or one-off backup:

```bash
docker compose exec docs-api sh -c "sqlite3 /data/db.sqlite '.dump' | gzip > /backups/db-$(date +%Y%m%d-%H%M%S).sql.gz"
```

#### Qdrant Vector Store

Qdrant stores vectors in `./data/qdrant`. For backup:

```bash
# Back up the entire Qdrant data directory from inside the container
docker compose exec qdrant sh -c "tar czf /backups/qdrant-backup-$(date +%Y%m%d).tar.gz /qdrant/storage"

# Or write the archive to the host (requires a volume mount);
# the archived path must match where the volume is mounted (/data here)
docker run --rm -v local-context7_data:/data -v "$(pwd)/backups:/backups" qdrant/qdrant:latest tar czf /backups/qdrant-backup-$(date +%Y%m%d).tar.gz /data
```

### Rebuild Without Losing Sources or Ingestion

Normal image rebuilds preserve Git source definitions, cloned repositories, uploaded documents, SQLite metadata, Qdrant vectors, and the embedding model cache because they are bind-mounted from the host.

```bash
git pull
docker compose up -d --build
```

Do not delete `data/`, `docs/`, or `docs_sources.yaml`. Do not run the reset commands below unless you intentionally want to erase the indexed data and source configuration.

### Safe Reset Command

To reset both SQLite and Qdrant cleanly:

```bash
docker compose down -v       # Stops services and removes volumes
rm -f ./data/db.sqlite       # Remove database file
rm -rf ./data/qdrant         # Remove Qdrant data
docker compose up -d --build
```

Or use the `make reset` command below.
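Because the reset is destructive, it may be worth taking a copy of the SQLite file first. The following is a minimal sketch using Python's standard `sqlite3` backup API, run on the host against the bind-mounted database. The function name `backup_sqlite` and the `db-<timestamp>.sqlite` naming are illustrative assumptions, not part of this project.

```python
import sqlite3
from datetime import datetime
from pathlib import Path


def backup_sqlite(db_path: str, backup_dir: str) -> Path:
    """Copy a SQLite database to a timestamped file (hypothetical helper)."""
    src_path = Path(db_path)
    if not src_path.is_file():
        raise FileNotFoundError(f"No database found at {src_path}")

    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = target_dir / f"db-{stamp}.sqlite"

    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(target)
    try:
        # Connection.backup writes a consistent snapshot of the source
        # database into the destination connection.
        src.backup(dst)
    finally:
        dst.close()
        src.close()
    return target
```

Assuming the default layout described above, `backup_sqlite("./data/db.sqlite", "./backups")` would write the copy next to the project before the reset commands are run.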
### Makefile Commands

The included `Makefile` provides convenient commands:

```bash
# Start services
make up

# Stop services
make down

# Rebuild and restart
make restart

# Backup database
make backup-db BACKUP_PATH=/backups/db-$(date +%Y%m%d).sqlite.gz

# Reset everything (delete volumes)
make reset
```

---

## Architecture

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Client    │────▶│  docs-api   │◀────│  docs-mcp   │
│ (IDE/Tool)  │     │  (FastAPI)  │     │ (MCP Server)│
└─────────────┘     └─────────────┘     └─────────────┘
                           │
                           ▼
                    ┌─────────────┐
                    │   Qdrant    │
                    │ (Vector DB) │
                    └─────────────┘
```

**Components:**

- `qdrant`: Vector database storing document embeddings
- `docs-api`: FastAPI backend exposing ingestion, search, and library endpoints
- `docs-mcp`: MCP server providing tools for Context7-style AI interactions

## Prerequisites

- Docker Engine v20.10+
- Docker Compose
- ~500MB free disk space (Qdrant + embedding model)

## Setup

1. **Download the project** and change into its directory:

   ```bash
   cd local-context7
   ```

2. **Copy environment file:**

   ```bash
   cp .env.example .env
   ```

3. **(Optional) Create sample docs:**

   ```bash
   mkdir -p docs/foundryvtt docs/fastapi docs/my-msfs-copilot
   ```

4. **Start services:**

   ```bash
   docker compose up -d --build
   ```

5. **Verify they're running:**

   ```bash
   docker compose ps
   ```

   You should see all three services (`qdrant`, `docs-api`, `docs-mcp`) in "Up" status.

6. **Wait for startup completion** (embedding model loads on first API call):

   ```bash
   docker compose logs -f docs-api
   # Watch for "Initialization complete."
   ```

## Add Docs

Place each library's documentation in its own folder under the project's `docs/` directory:

```bash
mkdir -p docs/foundryvtt/docs
cp /path/to/foundryvtt/*.md docs/foundryvtt/docs/
mkdir -p docs/fastapi
```

Supported file types: `.md`, `.txt`, `.py`, `.js`, `.ts`, `.json`, `.yaml`, `.yml`, `.html`, `.css`, `.pdf` (via pypdf).

After adding files, ingest them into the vector store:

```bash
docker compose exec docs-api python -c "from app.ingest import ingest_all; import asyncio; asyncio.run(ingest_all())"
```

Or trigger ingestion through the API from another terminal:

```bash
curl -X POST http://localhost:8787/api/v1/ingest/all \
  -H "Content-Type: application/json"
```

## Index Docs (Run Ingestion)

After adding documents, index them into the vector store:

```bash
docker compose exec docs-api python -c "from app.ingest import ingest_all; import asyncio; asyncio.run(ingest_all())"
```

Expected output shows progress like:

```
[Detection] Scanning for libraries in: /docs
[Detection] Found 3 library(ies)
[Library] Processing: foundryvtt
[Library] Scanning for files in: /docs/foundryvtt
[Library] Found 5 document(s)
...
```

## Search Docs

### Via API (POST to `/search`)

Request body:

```json
{
  "query": "how do hooks work",
  "library_id": "foundryvtt",
  "limit": 10
}
```

Response example:

```json
{
  "query": "hooks",
  "library_id": "foundryvtt",
  "results": [
    {
      "id": "...",
      "score": 0.854,
      "library_id": "foundryvtt",
      "path": "core-docs.md",
      "title": "Core Hooks",
      "chunk_index": 2
    }
  ],
  "count": 1
}
```

### Via MCP (resolve-library-id, search-docs tools)

The MCP server provides the `resolve-library-id` and `search-docs` tools to connected clients.

## Connect MCP Clients

To use this system with an MCP-enabled client (e.g., Claude Desktop), configure the MCP server endpoint.
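Before configuring a client, it can help to confirm that the docs API answers a search on its own. The sketch below builds the search request shown earlier using only the standard library. The helper names `build_search_request` and `run_search` are hypothetical, and the `/search` path and default base URL are assumptions taken from this README, so adjust them if your deployment mounts the API under a prefix.

```python
import json
import urllib.request
from typing import Optional


def build_search_request(
    query: str,
    library_id: str,
    limit: int = 10,
    api_key: Optional[str] = None,
    base_url: str = "http://localhost:8787",  # default HOST_PORT from .env
    path: str = "/search",  # assumption: change if the API uses a prefix
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the docs search endpoint."""
    body = json.dumps(
        {"query": query, "library_id": library_id, "limit": limit}
    ).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Only needed when DOCS_API_KEY is set on the server.
        headers["X-API-Key"] = api_key
    return urllib.request.Request(
        base_url.rstrip("/") + path, data=body, headers=headers, method="POST"
    )


def run_search(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the services running, `run_search(build_search_request("how do hooks work", "foundryvtt"))` should return a dictionary shaped like the response example in Search Docs. An empty `results` list usually means ingestion has not been run yet.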
### Example: Claude Desktop Config

Add to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "context7": {
      "command": "npx",
      "args": [
        "@modelcontextprotocol/server-local-context7",
        "--url",
        "http://localhost:8788"
      ],
      "env": {
        "DOCS_API_URL": "http://localhost:8787"
      }
    }
  }
}
```

If the client runs outside Docker and can't reach the API, expose the services on host ports or run the MCP server outside Docker (see below).

### Example: Cline/Cursor MCP Config

For Cursor or similar editors using Cline, add the following to `~/.cursor/mcp.json`. The `-i` flag is used without `-t` because the client launches the command without a TTY:

```json
{
  "context7": {
    "type": "stdio",
    "command": "docker",
    "args": [
      "exec",
      "-i",
      "docs-mcp",
      "uvicorn",
      "server:app",
      "--host",
      "0.0.0.0",
      "--port",
      "8788"
    ]
  }
}
```

Or if exposing MCP on host port:

```json
{
  "context7": {
    "type": "stdio",
    "command": "docker",
    "args": [
      "run",
      "-i",
      "--rm",
      "-p",
      "8788:8788",
      "--name",
      "context7-mcp-standalone",
      "-e",
      "DOCS_API_URL=http://host.docker.internal:8787",
      "local-context7/docs-mcp"
    ]
  }
}
```

## Troubleshooting

### Services won't start or restart loops

Check logs:

```bash
docker compose logs -f
```

Common issues:

- Port already in use on host → adjust mapping or free the port
- Embedding model failing to load → verify disk space, check for GPU constraints if applicable

### Vector search returns empty results

Ensure you've run ingestion after adding docs:

```bash
docker compose exec docs-api python -c "from app.ingest import ingest_all; import asyncio; asyncio.run(ingest_all())"
```

### Can't connect to docs-api from client outside Docker

Set the environment variable for host access in `docker-compose.yml` or `.env`:

```yaml
docs-api:
  environment:
    - DOCS_API_URL=http://host.docker.internal:8787
```

For MCP server specifically:

```yaml
docs-mcp:
  environment:
    - DOCS_API_URL=http://host.docker.internal:8787
```

## Reset Qdrant and SQLite

To clear all data (vector store and database):

```bash
# Stop services
docker compose down

# Remove data (deletes Qdrant storage and db.sqlite)
rm -rf ./data/qdrant ./data/db.sqlite

# Restart fresh
docker compose up -d --build
```

## Expose Through Caddy Reverse Proxy

To add HTTPS and serve under a subdomain, configure Caddy:

**Example `Caddyfile`:**

```caddyfile
docs.yourdomain.com {
    reverse_proxy docs-api:8787

    handle_path /mcp/* {
        reverse_proxy docs-mcp:8788
    }
    # Enable basic auth (optional, see below)
}

api.yourdomain.com {
    reverse_proxy docs-api:8787
}

mcp.yourdomain.com {
    reverse_proxy docs-mcp:8788
}
```

## Protect It with Basic Auth

Add authentication using Caddy's built-in `basic_auth` directive (named `basicauth` before Caddy 2.8). Caddy expects a hashed password, which you can generate with `caddy hash-password`.

**Caddy example with basic auth:**

```caddyfile
docs.yourdomain.com {
    basic_auth {
        # <username> <hashed-password>; both values are placeholders
        admin <hashed-password>
    }
    reverse_proxy docs-api:8787
}
```

For Docker-based deployment, consider using an authentication middleware or a dedicated reverse proxy with JWT/HTTP Basic configured externally.

## Future Improvements

- Add rate limiting to API endpoints
- Support for streaming responses for large document retrieval
- Chunk overlap configuration via environment variables
- Batch index endpoint improvements
- Metrics/logging aggregation (e.g., Prometheus + Grafana)
- Plugin system for additional data sources