# litellm-vector-store

A vector store service built on top of [LiteLLM](https://github.com/BerriAI/litellm) and [pgvector](https://github.com/pgvector/pgvector), providing an OpenAI-compatible API for semantic search, document storage, and Retrieval-Augmented Generation (RAG).

## Features

- 🔐 **Authentication** via LiteLLM API Keys
- 🗄️ **Vector Store** powered by PostgreSQL + pgvector
- 🔍 **Semantic Search** with optional Reranking
- 🤖 **RAG Endpoint** - Search + LLM in one request
- 📄 **File Upload** - PDF, DOCX, TXT, Markdown
- 🧩 **OpenAI-compatible API** - works with existing OpenAI SDKs
- 👥 **Multi-User** - Store permissions per user
- 🖥️ **Admin UI** - Manage users, stores and permissions
- 📊 **Usage Tracking** - Track requests per user

## Architecture

```
Client (API Key)
       │
       ▼
LiteLLM Proxy ──────────────────────┐
       │                            │
       ▼                            ▼
Vector Store API            Embedding Models
       │                     (via LiteLLM)
       ▼
PostgreSQL + pgvector
```

## Requirements

- Kubernetes Cluster
- PostgreSQL with pgvector extension
- LiteLLM Proxy (deployed)
- Container Registry

## Quick Start

### 1. Clone Repository

```bash
git clone https://github.com/your-org/litellm-vector-store.git
cd litellm-vector-store
```

### 2. Database Setup

```bash
kubectl exec -it -n <namespace> <postgres-pod> \
  -- psql -U postgres -d vectordb << 'EOF'
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

CREATE TABLE IF NOT EXISTS vector_stores (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(255) NOT NULL,
    owner_user_id VARCHAR(255) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS documents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    store_id UUID REFERENCES vector_stores(id) ON DELETE CASCADE,
    content TEXT NOT NULL,
    metadata JSONB DEFAULT '{}',
    embedding vector(1024),
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS store_permissions (
    store_id UUID REFERENCES vector_stores(id) ON DELETE CASCADE,
    user_id VARCHAR(255) NOT NULL,
    permission VARCHAR(50) DEFAULT 'read',
    PRIMARY KEY (store_id, user_id)
);

CREATE TABLE IF NOT EXISTS usage_stats (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id VARCHAR(255) NOT NULL,
    store_id UUID REFERENCES vector_stores(id) ON DELETE SET NULL,
    action VARCHAR(50) NOT NULL,
    tokens INT DEFAULT 0,
    duration FLOAT DEFAULT 0,
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX IF NOT EXISTS idx_documents_store ON documents(store_id);
CREATE INDEX IF NOT EXISTS idx_documents_embedding ON documents
    USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
CREATE INDEX IF NOT EXISTS idx_usage_user ON usage_stats(user_id);
CREATE INDEX IF NOT EXISTS idx_usage_created ON usage_stats(created_at);

GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO vecuser;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO vecuser;
EOF
```

### 3. Configure

```bash
kubectl create secret generic vector-api-secrets \
  --namespace vector-store \
  --from-literal=DATABASE_URL="postgresql://vecuser:pass@postgres:5432/vectordb" \
  --from-literal=LITELLM_MASTER_KEY="sk-master-key"
```

```yaml
# k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: vector-store-config
  namespace: vector-store
data:
  LITELLM_PROXY_URL: "http://litellm.<namespace>.svc.cluster.local:4000"
  ADMIN_USER_IDS: "your-admin-user-id"
  API_URL: "https://api.your-domain.com"
  EMBEDDING_MODEL: "your-embedding-model"
```

### 4. Build & Deploy

```bash
# API
docker build -t your-registry/vector-store-api:1.0.0 .
docker push your-registry/vector-store-api:1.0.0

# Admin UI
docker build \
  -t your-registry/vector-store-admin:1.0.0 \
  ./ui
docker push your-registry/vector-store-admin:1.0.0

# Deploy
kubectl apply -f k8s/namespace.yaml
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/secrets.yaml
kubectl apply -f k8s/vector-api/
kubectl apply -f k8s/admin-ui/
kubectl apply -f k8s/ingress-api.yaml
kubectl apply -f k8s/ingress-ui.yaml
```

## Project Structure

```
litellm-vector-store/
├── app/                        # FastAPI Backend
│   ├── main.py                 # Application entry point
│   ├── auth.py                 # LiteLLM authentication
│   ├── database.py             # PostgreSQL connection
│   ├── models.py               # Pydantic models
│   ├── routers/
│   │   ├── stores.py           # Vector store CRUD
│   │   ├── documents.py        # Document management
│   │   ├── admin.py            # Admin endpoints
│   │   └── openai_compat.py    # OpenAI-compatible API
│   └── utils/
│       ├── chunking.py         # Text chunking
│       └── stats.py            # Usage tracking
├── ui/                         # React Admin UI
│   ├── src/
│   │   ├── pages/
│   │   │   ├── Login.tsx
│   │   │   ├── Dashboard.tsx
│   │   │   ├── Users.tsx
│   │   │   └── Stores.tsx
│   │   ├── components/
│   │   │   ├── Layout.tsx
│   │   │   └── PermissionModal.tsx
│   │   └── api/
│   │       └── client.ts
│   └── Dockerfile
├── k8s/                        # Kubernetes manifests
│   ├── namespace.yaml
│   ├── configmap.yaml
│   ├── secrets.yaml
│   ├── vector-api/
│   │   ├── deployment.yaml
│   │   └── service.yaml
│   ├── admin-ui/
│   │   ├── deployment.yaml
│   │   └── service.yaml
│   ├── ingress-api.yaml
│   └── ingress-ui.yaml
├── scripts/
│   └── init.sql                # Database initialization
├── Dockerfile
├── requirements.txt
└── README.md
```

## API Reference

### Base URL

```
https://api.your-domain.com/v1
```

### Authentication

```
Authorization: Bearer sk-your-api-key
```

### Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/v1/vector_stores` | Create store |
| `GET` | `/v1/vector_stores` | List stores |
| `GET` | `/v1/vector_stores/{id}` | Get store |
| `DELETE` | `/v1/vector_stores/{id}` | Delete store |
| `POST` | `/v1/vector_stores/{id}/files` | Add texts |
| `GET` | `/v1/vector_stores/{id}/files` | List files |
| `DELETE` | `/v1/vector_stores/{id}/files/{file_id}` | Delete file |
| `POST` | `/v1/vector_stores/{id}/upload` | Upload file |
| `POST` | `/v1/vector_stores/{id}/search` | Search |
| `POST` | `/v1/vector_stores/{id}/rag` | RAG query |
| `POST` | `/v1/embeddings` | Create embeddings |
| `GET` | `/v1/embeddings/models` | List embedding models |
| `GET` | `/v1/models` | List all models |

### Example

```python
import httpx

client = httpx.Client(
    base_url="https://api.your-domain.com/v1",
    headers={"Authorization": "Bearer sk-your-key"}
)

# Create store
store = client.post(
    "/vector_stores",
    json={"name": "My Knowledge Base"}
).json()

# Upload file
with open("document.pdf", "rb") as f:
    client.post(
        f"/vector_stores/{store['id']}/upload",
        files={"file": f}
    )

# Search
results = client.post(
    f"/vector_stores/{store['id']}/search",
    json={"query": "What is FastAPI?", "top_k": 3}
).json()

# RAG
answer = client.post(
    f"/vector_stores/{store['id']}/rag",
    json={"query": "What is FastAPI?"}
).json()
print(answer["answer"])
```

## Configuration Reference

### Environment Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `DATABASE_URL` | ✅ | - | PostgreSQL connection URL |
| `LITELLM_PROXY_URL` | ✅ | - | LiteLLM proxy URL |
| `LITELLM_MASTER_KEY` | ✅ | - | LiteLLM master key |
| `ADMIN_USER_IDS` | ✅ | - | Comma-separated admin user IDs |
| `EMBEDDING_MODEL` | ❌ | `text-embedding-ada-002` | Default embedding model |

### Supported File Formats

| Format | Extension | Notes |
|--------|-----------|-------|
| Text | `.txt` | UTF-8 encoded |
| PDF | `.pdf` | Text PDFs only, no scans |
| Word | `.docx` | Microsoft Word 2007+ |
| Markdown | `.md` | Standard Markdown |

### Limits

| Limit | Value |
|-------|-------|
| Max file size | 256 MB |
| Max search results | 50 |
| Request timeout | 600 seconds |
| Default chunk size | 512 characters |
| Default chunk overlap | 50 characters |

## Admin UI

The Admin UI is available at `https://admin.your-domain.com`. Log in with your admin API key to:

- 📊 View usage statistics
- 👥 Manage users and their stores
- 🔑 Rotate API keys
- 🔒 Grant/revoke store permissions

## Development

```bash
# Install dependencies
pip install -r requirements.txt

# Run locally
DATABASE_URL="postgresql://..." \
LITELLM_PROXY_URL="http://..." \
LITELLM_MASTER_KEY="sk-..." \
ADMIN_USER_IDS="your-id" \
uvicorn app.main:app --reload

# Run UI locally
cd ui
npm install
VITE_API_URL=http://localhost:8000 npm run dev
```

## Tech Stack

| Component | Technology |
|-----------|-----------|
| **API** | FastAPI + Python 3.12 |
| **Database** | PostgreSQL 16 + pgvector |
| **Auth** | LiteLLM Key Management |
| **Embeddings** | Via LiteLLM Proxy |
| **Admin UI** | React + TypeScript + Tailwind CSS |
| **Container** | Docker + Kubernetes |
| **Ingress** | NGINX Ingress Controller |
| **TLS** | cert-manager + Let's Encrypt |

## License

MIT License - see [LICENSE](LICENSE) for details.

## Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/my-feature`)
3. Commit your changes (`git commit -m 'Add my feature'`)
4. Push to the branch (`git push origin feature/my-feature`)
5. Open a Pull Request
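
## Appendix: Chunking Sketch

The Limits table lists a default chunk size of 512 characters and a default chunk overlap of 50 characters. As a rough illustration of what those two numbers mean, here is a minimal sliding-window sketch. The function name `chunk_text` and its signature are assumptions made for this example; the actual logic in `app/utils/chunking.py` may differ (for instance, it may split on sentence or paragraph boundaries).

```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 50) -> list[str]:
    """Split text into character windows of at most `chunk_size`,
    where each window repeats the last `overlap` characters of the previous one.

    Hypothetical helper for illustration only; not the project's implementation.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text,
        # so no trailing chunk consists purely of overlap.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "lorem ipsum " * 100  # 1200 characters
    for i, chunk in enumerate(chunk_text(sample)):
        print(i, len(chunk))
```

With the defaults, each window advances by 462 characters (512 minus 50), so neighbouring chunks share 50 characters of context and a sentence cut at a chunk boundary still appears intact in at least one chunk when it is short enough.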