# Context7-style Docs MCP System

A self-hosted documentation retrieval and search system that runs locally in Docker. It uses Qdrant for vector embeddings and SQLite for metadata storage, and exposes a FastAPI docs backend plus an MCP server for IDE/tool integration.

## 🏠 Home Server / Production Use

This section covers hardening recommendations for running this system on a home server or in production.

### Environment Variables (`.env`)

Copy `.env.example` to `.env` and configure:

```bash
cp .env.example .env
```

| Variable | Description | Example |
|----------|-------------|---------|
| `HOST_PORT` | Docs API host port (default: 8787) | `8787` |
| `MCP_HOST_PORT` | MCP server host port (default: 8788) | `8788` |
| `DOCS_API_KEY` | API key for docs-api authentication (optional) | `my-secret-key-123` |
| `MCP_API_KEY` | API key for MCP server authentication (optional; intended to be passed through to FastMCP) | `mcp-secret-key` |
| `DOCS_PATH` | Path to documentation files inside the container | `/docs` |
| `DB_PATH` | SQLite database path inside the container | `/data/db.sqlite` |
| `LOG_LEVEL` | Logging level: DEBUG, INFO, WARNING, ERROR | `INFO` |

> **Security Note:** API keys are optional. Leave them empty in `.env` if you don't need authentication (backward compatible with existing setups). If `DOCS_API_KEY` is set, the docs-api requires an `X-API-Key` header matching it on protected endpoints.

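The rule in the security note can be sketched as a small check. This is only an illustration of the described behavior, not the project's actual implementation: the function name `is_authorized` and the plain-dict header lookup are assumptions, and a real framework would handle header case-insensitivity for you.

```python
import hmac


def is_authorized(headers, expected_key):
    """Return True if the request may proceed under the optional-key rule.

    `headers` is a plain dict of request headers (hypothetical shape);
    `expected_key` is the configured DOCS_API_KEY, or "" when unset.
    """
    if not expected_key:
        return True  # no key configured: authentication is disabled
    supplied = headers.get("X-API-Key", "")
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(supplied.encode("utf-8"), expected_key.encode("utf-8"))
```

A caller would typically pass `os.environ.get("DOCS_API_KEY", "")` as `expected_key`, so an empty or missing variable keeps the API open as described above.
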
### Port Configuration

To change the host ports (for example, to fit firewall or network rules), override them at startup:

```bash
# Example: run docs-api on port 9000 and the MCP server on port 9001
HOST_PORT=9000 MCP_HOST_PORT=9001 docker compose up -d --build
```

### Backup Instructions

#### SQLite Database (`data/db.sqlite`)

Regular SQLite backups prevent data loss. The commands below assume `/backups` exists inside the `docs-api` container (for example, as a mounted volume). Example cron job:

```bash
# Add to crontab (runs daily at 2am).
# cron treats an unescaped % as a newline, so each % is escaped.
# -T disables TTY allocation, which cron does not provide.
# Replace /path/to/local-context7 with the project directory.
0 2 * * * cd /path/to/local-context7 && docker compose exec -T docs-api sqlite3 /data/db.sqlite ".backup '/backups/db_$(date +\%Y\%m\%d).sqlite'"
```

Or a one-off compressed SQL dump:

```bash
docker compose exec docs-api sh -c "sqlite3 /data/db.sqlite '.dump' | gzip > /backups/db-$(date +%Y%m%d-%H%M%S).sql.gz"
```

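The same kind of backup can be taken from Python with the standard library's online backup API. This is a minimal sketch, not part of the project: the function name `backup_sqlite` and the timestamped file naming are assumptions chosen to mirror the shell examples.

```python
import sqlite3
from datetime import datetime
from pathlib import Path


def backup_sqlite(src_path, backup_dir):
    """Copy a SQLite database to a timestamped file using Connection.backup()."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    dest_path = backup_dir / f"db_{datetime.now():%Y%m%d-%H%M%S}.sqlite"
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # The backup API produces a consistent copy even while the source is in use.
        src.backup(dest)
    finally:
        dest.close()
        src.close()
    return dest_path
```

For example, `backup_sqlite("data/db.sqlite", "backups")` would write a copy under `backups/` on the host, assuming the database file is reachable at that path.
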
#### Qdrant Vector Store

Qdrant stores vectors in `./data/qdrant` on the host, mounted at `/qdrant/storage` in the container. To back it up:

```bash
# Archive the Qdrant storage directory from inside the container
docker compose exec qdrant sh -c "tar czf /backups/qdrant-backup-$(date +%Y%m%d).tar.gz /qdrant/storage"

# Or archive the bind-mounted data directory directly on the host
mkdir -p backups
tar czf backups/qdrant-backup-$(date +%Y%m%d).tar.gz -C ./data qdrant
```

For a consistent archive, stop the `qdrant` service before copying its files.

### Safe Reset Command

To reset both SQLite and Qdrant cleanly (this deletes all indexed data, so back up first if you need it):

```bash
docker compose down -v        # Stop services and remove volumes
rm -f ./data/db.sqlite        # Remove the database file
rm -rf ./data/qdrant          # Remove Qdrant data
docker compose up -d --build
```

Or use the `make reset` command below.

### Makefile Commands

The included `Makefile` provides convenient commands:

```bash
# Start services
make up

# Stop services
make down

# Rebuild and restart
make restart

# Backup database
make backup-db BACKUP_PATH=/backups/db-$(date +%Y%m%d).sqlite.gz

# Reset everything (delete volumes)
make reset
```

---

## Architecture

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Client    │────▶│  docs-api   │◀────│  docs-mcp   │
│ (IDE/Tool)  │     │  (FastAPI)  │     │ (MCP Server)│
└─────────────┘     └─────────────┘     └─────────────┘
                           │
                           ▼
                    ┌─────────────┐
                    │   Qdrant    │
                    │ (Vector DB) │
                    └─────────────┘
```

**Components:**
- `qdrant`: Vector database storing document embeddings
- `docs-api`: FastAPI backend exposing ingestion, search, and library endpoints
- `docs-mcp`: MCP server providing tools for Context7-style AI interactions

## Prerequisites

- Docker Engine v20.10+
- Docker Compose v2 (the `docker compose` plugin used throughout this README)
- ~500MB free disk space (Qdrant + embedding model)

## Setup

1. **Download the project** and change into its directory:

   ```bash
   cd local-context7
   ```

2. **Copy the environment file:**

   ```bash
   cp .env.example .env
   ```

3. **(Optional) Create sample docs folders:**

   ```bash
   mkdir -p docs/foundryvtt docs/fastapi docs/my-msfs-copilot
   ```

4. **Start services:**

   ```bash
   docker compose up -d --build
   ```

5. **Verify they're running:**

   ```bash
   docker compose ps
   ```

   You should see all three services (`qdrant`, `docs-api`, `docs-mcp`) in "Up" status.

6. **Wait for startup to complete** (the embedding model loads on the first API call):

   ```bash
   docker compose logs -f docs-api  # Watch for "Initialization complete."
   ```

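When scripting the setup, the "wait for startup" step can be approximated by polling the published port. The sketch below is an assumption-laden helper (the name `wait_for_port` and its defaults are not part of the project), and a successful TCP connection only shows that the port accepts connections, not that the embedding model has finished loading.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

With the default ports from `.env`, a script might call `wait_for_port("localhost", 8787)` and then `wait_for_port("localhost", 8788)` before sending the first request.
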
## Add Docs

Place each library's documentation in its own folder under `docs/`:

```bash
mkdir -p docs/foundryvtt/docs
cp /path/to/foundryvtt/*.md docs/foundryvtt/docs/
mkdir -p docs/fastapi
```

Supported file types: `.md`, `.txt`, `.py`, `.js`, `.ts`, `.json`, `.yaml`, `.yml`, `.html`, `.css`, `.pdf` (via pypdf).

New files are not searchable until they are indexed. To index them, run:

```bash
docker compose exec docs-api python -c "from app.ingest import ingest_all; import asyncio; asyncio.run(ingest_all())"
```

Or trigger ingestion over HTTP from the host:

```bash
curl -X POST http://localhost:8787/api/v1/ingest/all \
  -H "Content-Type: application/json"
```

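Before running ingestion, it can help to see which files would qualify. The sketch below lists files with the supported extensions, grouped by top-level folder. It is not the project's ingestion code: treating each top-level folder as one library and skipping files placed directly in the root are assumptions, and the name `find_ingestible` is hypothetical.

```python
from collections import defaultdict
from pathlib import Path

# Extensions listed in this README as supported.
SUPPORTED = {".md", ".txt", ".py", ".js", ".ts", ".json",
             ".yaml", ".yml", ".html", ".css", ".pdf"}


def find_ingestible(docs_root):
    """Group supported files by top-level folder (assumed to be the library name)."""
    root = Path(docs_root)
    libraries = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in SUPPORTED:
            rel = path.relative_to(root)
            if len(rel.parts) > 1:  # skip files sitting directly in docs_root
                libraries[rel.parts[0]].append(rel.as_posix())
    return dict(libraries)
```

Calling `find_ingestible("docs")` from the project directory returns a dict mapping each folder name to the relative paths found under it.
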
## Index Docs (Run Ingestion)

After adding documents, index them into the vector store:

```bash
docker compose exec docs-api python -c "from app.ingest import ingest_all; import asyncio; asyncio.run(ingest_all())"
```

The output shows progress similar to:

```
[Detection] Scanning for libraries in: /docs
[Detection] Found 3 library(ies)
[Library] Processing: foundryvtt
[Library] Scanning for files in: /docs/foundryvtt
[Library] Found 5 document(s)
...
```

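If you capture that output in a script, the per-library counts can be pulled out with a small parser. This sketch assumes the log lines keep exactly the format shown above, which may change between versions; `summarize_ingest_log` is a hypothetical helper name.

```python
import re

# Patterns mirror the sample log lines shown in this README.
LIBRARY_RE = re.compile(r"^\[Library\] Processing: (?P<name>\S+)")
FOUND_RE = re.compile(r"^\[Library\] Found (?P<count>\d+) document\(s\)")


def summarize_ingest_log(text):
    """Map each library name to the document count reported after it."""
    counts = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        match = LIBRARY_RE.match(line)
        if match:
            current = match.group("name")
            continue
        match = FOUND_RE.match(line)
        if match and current is not None:
            counts[current] = int(match.group("count"))
    return counts
```

A library that reports zero documents is a quick hint that its folder is empty or contains only unsupported file types.
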
## Search Docs

### Via API (POST to `/search`)

Request body:

```json
{
  "query": "how do hooks work",
  "library_id": "foundryvtt",
  "limit": 10
}
```

Response example:

```json
{
  "query": "how do hooks work",
  "library_id": "foundryvtt",
  "results": [
    {
      "id": "...",
      "score": 0.854,
      "library_id": "foundryvtt",
      "path": "core-docs.md",
      "title": "Core Hooks",
      "chunk_index": 2
    }
  ],
  "count": 1
}
```

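A client for this endpoint can be sketched with the standard library alone. The request and response fields follow the JSON shown above; the base URL, the helper names `search_docs` and `top_hits`, and the assumption that `/search` sits directly under the base URL are illustrative and may need adjusting for your deployment.

```python
import json
import urllib.request


def search_docs(base_url, query, library_id=None, limit=10, timeout=10):
    """POST a search request and return the decoded JSON response."""
    payload = {"query": query, "limit": limit}
    if library_id:
        payload["library_id"] = library_id
    req = urllib.request.Request(
        base_url.rstrip("/") + "/search",  # path assumed from the heading above
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def top_hits(response, min_score=0.0):
    """Return (score, path, title) tuples sorted by descending score."""
    hits = [
        (r["score"], r["path"], r["title"])
        for r in response.get("results", [])
        if r["score"] >= min_score
    ]
    return sorted(hits, reverse=True)
```

For example, `top_hits(search_docs("http://localhost:8787", "how do hooks work", "foundryvtt"), min_score=0.5)` would keep only the stronger matches, assuming the services are running.
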
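### Via MCP (resolve-library-id, search-docs tools)

The MCP server exposes the `resolve-library-id` and `search-docs` tools to connected clients. Client configuration is covered under Connect MCP Clients.
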
## Connect MCP Clients

To use this system with an MCP-enabled client (e.g., Claude Desktop), configure the MCP server endpoint.

### Example: Claude Desktop Config

Add to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "context7": {
      "command": "npx",
      "args": [
        "@modelcontextprotocol/server-local-context7",
        "--url", "http://localhost:8788"
      ],
      "env": {
        "DOCS_API_URL": "http://localhost:8787"
      }
    }
  }
}
```

If the client runs outside Docker and can't reach the API, expose the services on host ports or run the MCP server outside Docker (see below).

### Example: Cline/Cursor MCP Config

For Cursor or similar editors using Cline, add the following to `~/.cursor/mcp.json`. The `-i` flag keeps stdin open for the stdio transport; `-t` is omitted because the editor does not provide a TTY.

```json
{
  "mcpServers": {
    "context7": {
      "type": "stdio",
      "command": "docker",
      "args": [
        "exec",
        "-i",
        "docs-mcp",
        "uvicorn",
        "server:app",
        "--host",
        "0.0.0.0",
        "--port",
        "8788"
      ]
    }
  }
}
```

Or, to run the MCP server as a standalone container with its port published on the host:

```json
{
  "mcpServers": {
    "context7": {
      "type": "stdio",
      "command": "docker",
      "args": [
        "run",
        "-i",
        "--rm",
        "-p",
        "8788:8788",
        "--name",
        "context7-mcp-standalone",
        "-e",
        "DOCS_API_URL=http://host.docker.internal:8787",
        "local-context7/docs-mcp"
      ]
    }
  }
}
```

## Troubleshooting

### Services won't start or restart loops

Check logs:

```bash
docker compose logs -f
```

Common issues:
- Port already in use on host → adjust the mapping or free the port
- Embedding model failing to load → verify disk space, check for GPU constraints if applicable

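To check for the "port already in use" case before starting the stack, a bind test is usually enough. This is a generic sketch rather than anything shipped with the project; `port_is_free` is a hypothetical helper, and the default bind address `0.0.0.0` is an assumption about how the ports are published.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False  # typically "address already in use"
    return True
```

Running `port_is_free(8787)` and `port_is_free(8788)` before `docker compose up` shows whether `HOST_PORT` or `MCP_HOST_PORT` needs to change.
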
### Vector search returns empty results

Ensure you've run ingestion after adding docs:

```bash
docker compose exec docs-api python -c "from app.ingest import ingest_all; import asyncio; asyncio.run(ingest_all())"
```

### Can't connect to docs-api from client outside Docker

Point `DOCS_API_URL` at the host, either in `docker-compose.yml` or in `.env`:

```yaml
docs-api:
  environment:
    - DOCS_API_URL=http://host.docker.internal:8787
```

For the MCP server specifically:

```yaml
docs-mcp:
  environment:
    - DOCS_API_URL=http://host.docker.internal:8787
```

On Linux, `host.docker.internal` resolves only if the service also maps it with `extra_hosts: ["host.docker.internal:host-gateway"]`.

## Reset Qdrant and SQLite

To clear all data (vector store and database):

```bash
# Stop services
docker compose down

# Delete the Qdrant data directory and the SQLite database file
rm -rf ./data/qdrant ./data/db.sqlite

# Restart fresh
docker compose up -d --build
```

## Expose Through Caddy Reverse Proxy

To add HTTPS and serve under a subdomain, configure Caddy:

**Example `Caddyfile`:**

```caddyfile
docs.yourdomain.com {
    reverse_proxy docs-api:8787
    handle_path /mcp/* {
        reverse_proxy docs-mcp:8788
    }

    # Enable basic auth (optional, see below)
}

api.yourdomain.com {
    reverse_proxy docs-api:8787
}

mcp.yourdomain.com {
    reverse_proxy docs-mcp:8788
}
```

## Protect It with Basic Auth

Add authentication using Caddy's built-in `basic_auth` directive (named `basicauth` before Caddy 2.8). Passwords are stored as hashes, which you can generate with `caddy hash-password`.

**Caddy example with basic auth:**

```caddyfile
docs.yourdomain.com {
    basic_auth {
        # <username> <hashed password>; both values here are placeholders
        admin YOUR_PASSWORD_HASH
    }
    reverse_proxy docs-api:8787
}
```

For Docker-based deployment, consider using an authentication middleware or a dedicated reverse proxy with JWT/HTTP Basic configured externally.

## Future Improvements

- Add rate limiting to API endpoints
- Support for streaming responses for large document retrieval
- Chunk overlap configuration via environment variables
- Batch index endpoint improvements
- Metrics/logging aggregation (e.g., Prometheus + Grafana)
- Plugin system for additional data sources