A production-ready RAG (Retrieval-Augmented Generation) system that enables natural language querying of your codebase using vector search and LLM-powered responses.
Components:
- Qdrant - Vector database for embeddings storage and semantic search
- FastAPI - Python backend API with query orchestration
- Nginx - Reverse proxy and static file serving
- Indexer - JSON ingestion pipeline for codebase documents
- Widget - Embeddable chat widget (CSS + JS)
- OpenAI - GPT-4o for response generation and text-embedding-3-small for embeddings
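To make the retrieval step concrete, here is a minimal, self-contained sketch of how semantic search ranks stored chunks against a query embedding. It is an illustration only: in this stack Qdrant performs the search and `text-embedding-3-small` produces the vectors. The `Chunk` class, the `top_k` helper, and the two-dimensional toy vectors are hypothetical and are not taken from the project code.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class Chunk:
    """One indexed document chunk (hypothetical shape, for illustration)."""
    project: str
    source_file: str
    title: str
    vector: list[float]  # toy embedding; real vectors come from the embedding model


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vector: list[float], chunks: list[Chunk], k: int = 2) -> list[dict]:
    """Return the k most similar chunks, shaped like the API's source entries."""
    scored = [(cosine(query_vector, c.vector), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [
        {
            "project": c.project,
            "source_file": c.source_file,
            "title": c.title,
            "score": round(score, 2),
        }
        for score, c in scored[:k]
    ]
```

The top-ranked chunks would then be passed to the LLM as context for the answer; that step needs the OpenAI API and is left out of the sketch.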
# 1. Clone and Configure
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
# 2. Start services
docker compose up -d
# 3. Index projects using the skill, for example with the prompt:
#    "Index my cemoche.github.io project"
# 4. Run indexer
docker compose --profile indexer run --rm indexer
# 5. Test
curl -X POST http://localhost:8080/api/query \
-H "Content-Type: application/json" \
-d '{"question": "What stack did you use?"}'

| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check with Qdrant connectivity status |
| POST | /api/query | Submit a question, get answer + source attribution |

curl -X POST http://localhost:8080/api/query \
-H "Content-Type: application/json" \
-d '{"question": "What stack did you use?"}'

Response:
{
"answer": "The project uses...",
"sources": [
{
"project": "cemoche.github.io",
"source_file": "src/components/Hero.tsx",
"title": "Hero Component",
"score": 0.92
}
]
}

| Service | Port | Protocol | Notes |
|---|---|---|---|
| Qdrant | 7345 | HTTP | External access to Qdrant dashboard |
| Qdrant | 7346 | gRPC | gRPC API |
| Nginx | 8080 | HTTP | Main application entry point |
| Nginx | 8443 | HTTPS | SSL/TLS (when configured) |
| FastAPI | 8000 | HTTP | Internal only, proxied via Nginx |
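
The same request can be issued from Python. The sketch below uses only the standard library and assumes the request and response shapes shown in the curl example above, with Nginx reachable on port 8080. The helper names (`build_query_request`, `parse_answer`, `ask`) are hypothetical and not part of the project.

```python
from __future__ import annotations

import json
import urllib.request

BASE_URL = "http://localhost:8080"  # Nginx entry point from the port table


def build_query_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for /api/query without sending it."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_answer(raw: bytes) -> tuple[str, list[str]]:
    """Extract the answer and a one-line summary per source from a response body."""
    body = json.loads(raw)
    sources = [
        f'{s["project"]}:{s["source_file"]} ({s["score"]:.2f})'
        for s in body.get("sources", [])
    ]
    return body["answer"], sources


def ask(question: str, timeout: float = 30.0) -> tuple[str, list[str]]:
    """Send the question to the running stack and return (answer, sources)."""
    with urllib.request.urlopen(build_query_request(question), timeout=timeout) as resp:
        return parse_answer(resp.read())
```

With the services running, `ask("What stack did you use?")` returns the answer text and a list of source summaries. Error handling for non-2xx responses is omitted; `urlopen` raises `urllib.error.HTTPError` in that case.
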
ask-my-codebase/
├── docker-compose.yml
├── .env.example
├── README.md
├── api/ # FastAPI backend
│ ├── main.py
│ ├── query_engine.py
│ ├── models.py
│ ├── config.py
│ ├── Dockerfile
│ └── requirements.txt
├── indexer/ # JSON ingestion
│ ├── index.py
│ ├── json_loader.py
│ ├── Dockerfile
│ └── requirements.txt
├── nginx/ # Reverse proxy
│ ├── Dockerfile
│ ├── nginx.conf
│ └── conf.d/
│ └── default.conf
├── widget/ # Chat widget
│ ├── chat-widget.css
│ └── chat-widget.js
├── qdrant/ # Qdrant config
│ └── Dockerfile
└── .opencode/skills/ # Custom skills
└── rag-codebase-structuring/
└── SKILL.md
Add to your HTML:
<link rel="stylesheet" href="/chat-widget.css">
<script src="/chat-widget.js"></script>

The widget will auto-initialize and connect to /api/query via the Nginx proxy.
# View API logs
docker compose logs -f api
# Stop all services and remove volumes
docker compose down -v
# Rebuild API after code changes
docker compose build api
# Check health
curl http://localhost:8080/health
# Access Qdrant dashboard
http://localhost:7345/dashboard

Copy .env.example to .env and configure:
| Variable | Required | Default | Description |
|---|---|---|---|
| OPENAI_API_KEY | Yes | - | OpenAI API key |
| QDRANT_HOST | No | qdrant | Qdrant hostname |
| QDRANT_PORT | No | 6333 | Qdrant port |
| QDRANT_COLLECTION | No | codebase_knowledge | Vector collection name |
| LLM_MODEL | No | gpt-4o | OpenAI LLM model |
| EMBEDDING_MODEL | No | text-embedding-3-small | Embedding model |
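
As an illustration of how these variables and defaults fit together, here is a minimal sketch of a settings loader. It is an assumption about one reasonable approach, not the contents of `api/config.py`; the `Settings` class and `load_settings` function are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Configuration values mirroring the table above (hypothetical class)."""
    openai_api_key: str
    qdrant_host: str = "qdrant"
    qdrant_port: int = 6333
    qdrant_collection: str = "codebase_knowledge"
    llm_model: str = "gpt-4o"
    embedding_model: str = "text-embedding-3-small"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping of environment variables, applying defaults."""
    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        # The only required variable: fail fast with a clear message.
        raise ValueError("OPENAI_API_KEY is required")
    return Settings(
        openai_api_key=api_key,
        qdrant_host=env.get("QDRANT_HOST", "qdrant"),
        qdrant_port=int(env.get("QDRANT_PORT", "6333")),
        qdrant_collection=env.get("QDRANT_COLLECTION", "codebase_knowledge"),
        llm_model=env.get("LLM_MODEL", "gpt-4o"),
        embedding_model=env.get("EMBEDDING_MODEL", "text-embedding-3-small"),
    )
```

Passing the environment as a parameter keeps the loader easy to test: `load_settings({"OPENAI_API_KEY": "sk-..."})` yields the defaults without touching the real process environment.
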
MIT