# litellm-vector-store

A vector store service built on top of [LiteLLM](https://github.com/BerriAI/litellm) and [pgvector](https://github.com/pgvector/pgvector), providing an OpenAI-compatible API for semantic search, document storage, and Retrieval-Augmented Generation (RAG).

## Features

- 🔐 **Authentication** via LiteLLM API keys
- 🗄️ **Vector Store** powered by PostgreSQL + pgvector
- 🔍 **Semantic Search** with optional reranking
- 🤖 **RAG Endpoint** - search + LLM in one request
- 📄 **File Upload** - PDF, DOCX, TXT, Markdown, Excel, CSV, PowerPoint, HTML, e-mail, JSON
- 🖼️ **Image Support** - upload images via a Vision LLM (JPG, PNG, GIF, WebP, TIFF)
- 🧩 **OpenAI-compatible API** - works with existing OpenAI SDKs
- 👥 **Multi-User** - store permissions per user
- 🖥️ **Admin UI** - manage users, stores, and permissions
- 📊 **Usage Tracking** - track requests per user

## Architecture

```
      Client (API Key)
            │
            ▼
      LiteLLM Proxy ─────────────────────────┐
            │                                │
            ▼                                ▼
     Vector Store API                 LiteLLM Models
            │                       ┌──────────────────┐
            ▼                       │ Embedding Models │
  PostgreSQL + pgvector             │ Vision Models    │
                                    │ LLM Models       │
                                    └──────────────────┘
```

## Requirements

- Kubernetes cluster
- PostgreSQL with the pgvector extension (already deployed)
- LiteLLM Proxy (already deployed)
- Container registry

## Quick Start

### 1. Clone Repository

```bash
git clone https://github.com/your-org/litellm-vector-store.git
cd litellm-vector-store
```
### 2. Database Setup

```bash
# Replace <postgres-pod> and <namespace> with your deployment's values
kubectl exec -it <postgres-pod> -n <namespace> -- psql -U postgres -d vectordb << 'EOF'
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

CREATE TABLE IF NOT EXISTS vector_stores (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(255) NOT NULL,
    owner_user_id VARCHAR(255) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS documents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    store_id UUID REFERENCES vector_stores(id) ON DELETE CASCADE,
    content TEXT NOT NULL,
    metadata JSONB DEFAULT '{}',
    embedding vector(1024),  -- dimension must match your embedding model
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS store_permissions (
    store_id UUID REFERENCES vector_stores(id) ON DELETE CASCADE,
    user_id VARCHAR(255) NOT NULL,
    permission VARCHAR(50) DEFAULT 'read',
    PRIMARY KEY (store_id, user_id)
);

CREATE TABLE IF NOT EXISTS usage_stats (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id VARCHAR(255) NOT NULL,
    store_id UUID REFERENCES vector_stores(id) ON DELETE SET NULL,
    action VARCHAR(50) NOT NULL,
    tokens INT DEFAULT 0,
    duration FLOAT DEFAULT 0,
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX IF NOT EXISTS idx_documents_store ON documents(store_id);
CREATE INDEX IF NOT EXISTS idx_documents_embedding ON documents
    USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
CREATE INDEX IF NOT EXISTS idx_usage_user ON usage_stats(user_id);
CREATE INDEX IF NOT EXISTS idx_usage_created ON usage_stats(created_at);

GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO vecuser;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO vecuser;
EOF
```
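The `idx_documents_embedding` index above uses `vector_cosine_ops`, so nearest neighbors are ranked by cosine distance. As a rough sketch of what pgvector's `<=>` operator computes (a pure-Python illustration, not the service's code):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance as used by pgvector's vector_cosine_ops:
    1 - cos(angle between a and b). 0 = same direction, 2 = opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Parallel vectors have distance 0; orthogonal vectors have distance 1.
print(cosine_distance([1.0, 0.0], [2.0, 0.0]))  # → 0.0
print(cosine_distance([1.0, 0.0], [0.0, 3.0]))  # → 1.0
```

Because the distance ignores vector magnitude, only the direction of the embeddings matters — which is why cosine distance is the usual choice for text embeddings.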
### 3. Configure

```bash
# Create secrets
kubectl create secret generic vector-api-secrets \
  --namespace vector-store \
  --from-literal=DATABASE_URL="postgresql://vecuser:pass@postgres:5432/vectordb" \
  --from-literal=LITELLM_MASTER_KEY="sk-master-key"
```

```yaml
# k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: vector-store-config
  namespace: vector-store
data:
  LITELLM_PROXY_URL: "http://litellm.<namespace>.svc.cluster.local:4000"
  ADMIN_USER_IDS: "your-admin-user-id"
  API_URL: "https://api.your-domain.com"
  EMBEDDING_MODEL: "your-embedding-model"
  VISION_MODEL: "openai/gpt-4o-mini"
```

### 4. Build & Deploy

```bash
# Build & push API
docker build -t your-registry/vector-store-api:1.0.0 .
docker push your-registry/vector-store-api:1.0.0

# Build & push Admin UI
docker build \
  -t your-registry/vector-store-admin:1.0.0 \
  ./ui
docker push your-registry/vector-store-admin:1.0.0

# Deploy
kubectl apply -f k8s/namespace.yaml
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/secrets.yaml
kubectl apply -f k8s/vector-api/
kubectl apply -f k8s/admin-ui/
kubectl apply -f k8s/ingress-api.yaml
kubectl apply -f k8s/ingress-ui.yaml
```

## Project Structure

```
litellm-vector-store/
├── app/                         # FastAPI Backend
│   ├── main.py                  # Application entry point
│   ├── auth.py                  # LiteLLM authentication
│   ├── database.py              # PostgreSQL connection
│   ├── models.py                # Pydantic models
│   ├── routers/
│   │   ├── stores.py            # Vector store CRUD
│   │   ├── documents.py         # Document management
│   │   ├── admin.py             # Admin endpoints
│   │   └── openai_compat.py     # OpenAI-compatible API
│   └── utils/
│       ├── chunking.py          # Text chunking
│       ├── image_processor.py   # Vision LLM integration
│       └── stats.py             # Usage tracking
├── ui/                          # React Admin UI
│   ├── src/
│   │   ├── pages/
│   │   │   ├── Login.tsx
│   │   │   ├── Dashboard.tsx
│   │   │   ├── Users.tsx
│   │   │   └── Stores.tsx
│   │   ├── components/
│   │   │   ├── Layout.tsx
│   │   │   └── PermissionModal.tsx
│   │   └── api/
│   │       └── client.ts
│   └── Dockerfile
├── k8s/                         # Kubernetes manifests
│   ├── namespace.yaml
│   ├── configmap.yaml
│   ├── secrets.yaml
│   ├── vector-api/
│   │   ├── deployment.yaml
│   │   └── service.yaml
│   ├── admin-ui/
│   │   ├── deployment.yaml
│   │   └── service.yaml
│   ├── ingress-api.yaml
│   └── ingress-ui.yaml
├── scripts/
│   └── init.sql                 # Database initialization
├── Dockerfile
├── requirements.txt
└── README.md
```

## API Reference

### Base URL

```
https://api.your-domain.com/v1
```

### Authentication

```
Authorization: Bearer sk-your-api-key
```

### Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/v1/models` | List all models |
| `GET` | `/v1/embeddings/models` | List embedding models |
| `GET` | `/v1/vision/models` | List vision models |
| `POST` | `/v1/embeddings` | Create embeddings |
| `POST` | `/v1/vector_stores` | Create store |
| `GET` | `/v1/vector_stores` | List stores |
| `GET` | `/v1/vector_stores/{id}` | Get store |
| `DELETE` | `/v1/vector_stores/{id}` | Delete store |
| `POST` | `/v1/vector_stores/{id}/files` | Add texts |
| `GET` | `/v1/vector_stores/{id}/files` | List files |
| `DELETE` | `/v1/vector_stores/{id}/files/{file_id}` | Delete file |
| `POST` | `/v1/vector_stores/{id}/upload` | Upload file or image |
| `POST` | `/v1/vector_stores/{id}/search` | Semantic search |
| `POST` | `/v1/vector_stores/{id}/rag` | RAG query |

### Examples

#### Create store & upload file

```python
import httpx

client = httpx.Client(
    base_url="https://api.your-domain.com/v1",
    headers={"Authorization": "Bearer sk-your-key"},
    timeout=120.0
)

# Create store
store = client.post(
    "/vector_stores",
    json={"name": "My Knowledge Base"}
).json()

# Upload document
with open("document.pdf", "rb") as f:
    client.post(
        f"/vector_stores/{store['id']}/upload",
        files={"file": f}
    )

# Upload image (with default vision model)
with open("screenshot.png", "rb") as f:
    client.post(
        f"/vector_stores/{store['id']}/upload",
        files={"file": f}
    )

# Upload image (with custom vision model)
with open("diagram.png", "rb") as f:
    client.post(
        f"/vector_stores/{store['id']}/upload",
        files={"file": f},
        data={
            "vision_model": "openai/gpt-4o",
            "vision_prompt": "Explain this diagram in detail."
        }
    )

# Search
results = client.post(
    f"/vector_stores/{store['id']}/search",
    json={
        "query": "What is FastAPI?",
        "top_k": 3,
        "rerank": True
    }
).json()

# RAG
answer = client.post(
    f"/vector_stores/{store['id']}/rag",
    json={
        "query": "What is FastAPI?",
        "model": "openai/gpt-4o-mini",
        "rerank": True
    }
).json()
print(answer["answer"])
```

#### JavaScript / TypeScript

```javascript
const API_KEY = "sk-your-api-key";
const BASE_URL = "https://api.your-domain.com/v1";
const HEADERS = {
  "Authorization": `Bearer ${API_KEY}`,
  "Content-Type": "application/json"
};

// Create store
const store = await fetch(`${BASE_URL}/vector_stores`, {
  method: "POST",
  headers: HEADERS,
  body: JSON.stringify({ name: "My Store" })
}).then(r => r.json());

// Search
const results = await fetch(
  `${BASE_URL}/vector_stores/${store.id}/search`, {
  method: "POST",
  headers: HEADERS,
  body: JSON.stringify({ query: "What is FastAPI?", top_k: 3, rerank: true })
}).then(r => r.json());

// RAG
const answer = await fetch(
  `${BASE_URL}/vector_stores/${store.id}/rag`, {
  method: "POST",
  headers: HEADERS,
  body: JSON.stringify({ query: "What is FastAPI?" })
}).then(r => r.json());
console.log(answer.answer);
```

#### curl

```bash
# Create store
curl -X POST https://api.your-domain.com/v1/vector_stores \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{"name": "My Store"}'

# Upload document
curl -X POST https://api.your-domain.com/v1/vector_stores/{store_id}/upload \
  -H "Authorization: Bearer sk-your-key" \
  -F "file=@document.pdf"

# Upload image with custom vision model
curl -X POST https://api.your-domain.com/v1/vector_stores/{store_id}/upload \
  -H "Authorization: Bearer sk-your-key" \
  -F "file=@diagram.png" \
  -F "vision_model=openai/gpt-4o" \
  -F "vision_prompt=Explain this diagram in detail."
# Search
curl -X POST https://api.your-domain.com/v1/vector_stores/{store_id}/search \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is FastAPI?", "top_k": 3, "rerank": true}'

# RAG
curl -X POST https://api.your-domain.com/v1/vector_stores/{store_id}/rag \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is FastAPI?", "model": "openai/gpt-4o-mini"}'
```

## Configuration Reference

### Environment Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `DATABASE_URL` | ✅ | — | PostgreSQL connection URL |
| `LITELLM_PROXY_URL` | ✅ | — | LiteLLM proxy URL |
| `LITELLM_MASTER_KEY` | ✅ | — | LiteLLM master key |
| `ADMIN_USER_IDS` | ✅ | — | Comma-separated admin user IDs |
| `EMBEDDING_MODEL` | ❌ | `text-embedding-ada-002` | Default embedding model |
| `VISION_MODEL` | ❌ | `openai/gpt-4o-mini` | Default vision model |

### Upload Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `file` | file | — | File to upload |
| `chunk_size` | int | 512 | Characters per chunk |
| `chunk_overlap` | int | 50 | Overlap between chunks |
| `vision_model` | string | Config default | Vision model for images |
| `vision_prompt` | string | Auto | Custom prompt for vision model |

### Search Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `query` | string | — | Search query |
| `top_k` | int | 5 | Number of results (max. 50) |
| `rerank` | bool | false | Enable reranking |
| `rerank_model` | string | Auto | Custom rerank model |
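To make the `chunk_size` and `chunk_overlap` upload parameters above concrete, here is a minimal sliding-window chunker in the same spirit — an illustration only, not the service's actual `app/utils/chunking.py`:

```python
def chunk_text(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows; consecutive chunks
    share chunk_overlap characters, so a sentence cut at one boundary
    still appears intact in a neighboring chunk."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]

# Small numbers for readability: windows of 4 chars, overlapping by 2.
print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2))
# → ['abcd', 'cdef', 'efgh', 'ghij']
```

Each chunk is embedded separately, so larger `chunk_size` means fewer, broader matches, while larger `chunk_overlap` reduces the chance that an answer is split across two chunks at the cost of some duplicated storage.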
### RAG Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `query` | string | — | Question |
| `model` | string | `cosair/gemma4:31b` | LLM model |
| `top_k` | int | 5 | Context documents |
| `rerank` | bool | false | Enable reranking |
| `system_prompt` | string | Auto | Custom system prompt |
| `messages` | array | [] | Chat history |

### Supported File Formats

| Format | Extension | Notes |
|--------|-----------|-------|
| Text | `.txt` | UTF-8 encoded |
| Markdown | `.md` | Standard Markdown |
| PDF | `.pdf` | Text PDFs only, no scans |
| Word | `.docx` | Microsoft Word 2007+ |
| Excel | `.xlsx` | All sheets extracted |
| CSV | `.csv` | All columns extracted |
| PowerPoint | `.pptx` | All slides extracted |
| HTML | `.html` `.htm` | Scripts/styles removed |
| Outlook Mail | `.msg` | Including headers |
| E-Mail | `.eml` | Including headers |
| JSON | `.json` | Pretty printed |
| Image | `.jpg` `.jpeg` `.png` `.gif` `.webp` `.tiff` | Via Vision LLM |

### Limits

| Limit | Value |
|-------|-------|
| Max file size | 256 MB |
| Max search results | 50 |
| Request timeout | 600 seconds |
| Default chunk size | 512 characters |
| Default chunk overlap | 50 characters |

## Admin UI

The Admin UI is available at `https://admin.your-domain.com`. Log in with your admin API key to:

- 📊 View usage statistics
- 👥 Manage users and their stores
- 🔑 Rotate API keys
- 🔒 Grant/revoke store permissions

## Development

```bash
# Install dependencies
pip install -r requirements.txt

# Run locally
DATABASE_URL="postgresql://..." \
LITELLM_PROXY_URL="http://..." \
LITELLM_MASTER_KEY="sk-..." \
ADMIN_USER_IDS="your-id" \
EMBEDDING_MODEL="your-model" \
VISION_MODEL="openai/gpt-4o-mini" \
uvicorn app.main:app --reload

# Run UI locally
cd ui
npm install
VITE_API_URL=http://localhost:8000 npm run dev
```

## Tech Stack

| Component | Technology |
|-----------|-----------|
| **API** | FastAPI + Python 3.12 |
| **Database** | PostgreSQL 16 + pgvector |
| **Auth** | LiteLLM Key Management |
| **Embeddings** | Via LiteLLM Proxy |
| **Vision** | Via LiteLLM Vision Models |
| **Admin UI** | React + TypeScript + Tailwind CSS |
| **Container** | Docker + Kubernetes |
| **Ingress** | NGINX Ingress Controller |
| **TLS** | cert-manager + Let's Encrypt |

## License

MIT License - see [LICENSE](LICENSE) for details.

## Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/my-feature`)
3. Commit your changes (`git commit -m 'Add my feature'`)
4. Push to the branch (`git push origin feature/my-feature`)
5. Open a Pull Request