# litellm-vector-store

A vector store service built on top of [LiteLLM](https://github.com/BerriAI/litellm) and [pgvector](https://github.com/pgvector/pgvector), providing an OpenAI-compatible API for semantic search, document storage, and Retrieval-Augmented Generation (RAG).

## Features

- 🔐 **Authentication** via LiteLLM API keys
- 🗄️ **Vector Store** powered by PostgreSQL + pgvector
- 🔍 **Semantic Search** with optional reranking
- 🤖 **RAG Endpoint** - search + LLM completion in one request
- 📄 **File Upload** - PDF, DOCX, TXT, Markdown, Excel, CSV, PowerPoint, HTML, e-mail, JSON
- 🖼️ **Image Support** - uploaded images are described via a Vision LLM (JPG, PNG, GIF, WebP, TIFF)
- 🧩 **OpenAI-compatible API** - works with existing OpenAI SDKs
- 👥 **Multi-User** - per-user store permissions
- 🖥️ **Admin UI** - manage users, stores, and permissions
- 📊 **Usage Tracking** - track requests per user

|
## Architecture

```
Client (API Key)
       │
       ▼
LiteLLM Proxy ──────────────────────────────┐
       │                                    │
       ▼                                    ▼
Vector Store API                     LiteLLM Models
       │                            ┌──────────────────┐
       ▼                            │ Embedding Models │
PostgreSQL + pgvector               │ Vision Models    │
                                    │ LLM Models       │
                                    └──────────────────┘
```

|
## Requirements

- Kubernetes cluster
- PostgreSQL with the pgvector extension (already deployed)
- LiteLLM Proxy (already deployed)
- Container registry

|
## Quick Start

### 1. Clone the Repository

```bash
git clone https://github.com/your-org/litellm-vector-store.git
cd litellm-vector-store
```

### 2. Database Setup

```bash
kubectl exec -it <postgres-pod> -n <namespace> \
  -- psql -U postgres -d vectordb << 'EOF'

CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

CREATE TABLE IF NOT EXISTS vector_stores (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(255) NOT NULL,
    owner_user_id VARCHAR(255) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS documents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    store_id UUID REFERENCES vector_stores(id) ON DELETE CASCADE,
    content TEXT NOT NULL,
    metadata JSONB DEFAULT '{}',
    embedding vector(1024),
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS store_permissions (
    store_id UUID REFERENCES vector_stores(id) ON DELETE CASCADE,
    user_id VARCHAR(255) NOT NULL,
    permission VARCHAR(50) DEFAULT 'read',
    PRIMARY KEY (store_id, user_id)
);

CREATE TABLE IF NOT EXISTS usage_stats (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id VARCHAR(255) NOT NULL,
    store_id UUID REFERENCES vector_stores(id) ON DELETE SET NULL,
    action VARCHAR(50) NOT NULL,
    tokens INT DEFAULT 0,
    duration FLOAT DEFAULT 0,
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX IF NOT EXISTS idx_documents_store
    ON documents(store_id);
CREATE INDEX IF NOT EXISTS idx_documents_embedding
    ON documents USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);
CREATE INDEX IF NOT EXISTS idx_usage_user
    ON usage_stats(user_id);
CREATE INDEX IF NOT EXISTS idx_usage_created
    ON usage_stats(created_at);

GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO vecuser;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO vecuser;

EOF
```

### 3. Configure

```bash
# Create secrets
kubectl create secret generic vector-api-secrets \
  --namespace vector-store \
  --from-literal=DATABASE_URL="postgresql://vecuser:pass@postgres:5432/vectordb" \
  --from-literal=LITELLM_MASTER_KEY="sk-master-key"
```

```yaml
# k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: vector-store-config
  namespace: vector-store
data:
  LITELLM_PROXY_URL: "http://litellm.<namespace>.svc.cluster.local:4000"
  ADMIN_USER_IDS: "your-admin-user-id"
  API_URL: "https://api.your-domain.com"
  EMBEDDING_MODEL: "your-embedding-model"
  VISION_MODEL: "openai/gpt-4o-mini"
```

### 4. Build & Deploy

```bash
# Build & push the API
docker build -t your-registry/vector-store-api:1.0.0 .
docker push your-registry/vector-store-api:1.0.0

# Build & push the Admin UI
docker build \
  -t your-registry/vector-store-admin:1.0.0 \
  ./ui
docker push your-registry/vector-store-admin:1.0.0

# Deploy
kubectl apply -f k8s/namespace.yaml
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/secrets.yaml
kubectl apply -f k8s/vector-api/
kubectl apply -f k8s/admin-ui/
kubectl apply -f k8s/ingress-api.yaml
kubectl apply -f k8s/ingress-ui.yaml
```

## Project Structure

```
litellm-vector-store/
├── app/                        # FastAPI backend
│   ├── main.py                 # Application entry point
│   ├── auth.py                 # LiteLLM authentication
│   ├── database.py             # PostgreSQL connection
│   ├── models.py               # Pydantic models
│   ├── routers/
│   │   ├── stores.py           # Vector store CRUD
│   │   ├── documents.py        # Document management
│   │   ├── admin.py            # Admin endpoints
│   │   └── openai_compat.py    # OpenAI-compatible API
│   └── utils/
│       ├── chunking.py         # Text chunking
│       ├── image_processor.py  # Vision LLM integration
│       └── stats.py            # Usage tracking
├── ui/                         # React Admin UI
│   ├── src/
│   │   ├── pages/
│   │   │   ├── Login.tsx
│   │   │   ├── Dashboard.tsx
│   │   │   ├── Users.tsx
│   │   │   └── Stores.tsx
│   │   ├── components/
│   │   │   ├── Layout.tsx
│   │   │   └── PermissionModal.tsx
│   │   └── api/
│   │       └── client.ts
│   └── Dockerfile
├── k8s/                        # Kubernetes manifests
│   ├── namespace.yaml
│   ├── configmap.yaml
│   ├── secrets.yaml
│   ├── vector-api/
│   │   ├── deployment.yaml
│   │   └── service.yaml
│   ├── admin-ui/
│   │   ├── deployment.yaml
│   │   └── service.yaml
│   ├── ingress-api.yaml
│   └── ingress-ui.yaml
├── scripts/
│   └── init.sql                # Database initialization
├── Dockerfile
├── requirements.txt
└── README.md
```

## API Reference

### Base URL

```
https://api.your-domain.com/v1
```

### Authentication

```
Authorization: Bearer sk-your-api-key
```

### Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/v1/models` | List all models |
| `GET` | `/v1/embeddings/models` | List embedding models |
| `GET` | `/v1/vision/models` | List vision models |
| `POST` | `/v1/embeddings` | Create embeddings |
| `POST` | `/v1/vector_stores` | Create store |
| `GET` | `/v1/vector_stores` | List stores |
| `GET` | `/v1/vector_stores/{id}` | Get store |
| `DELETE` | `/v1/vector_stores/{id}` | Delete store |
| `POST` | `/v1/vector_stores/{id}/files` | Add texts |
| `GET` | `/v1/vector_stores/{id}/files` | List files |
| `DELETE` | `/v1/vector_stores/{id}/files/{file_id}` | Delete file |
| `POST` | `/v1/vector_stores/{id}/upload` | Upload file or image |
| `POST` | `/v1/vector_stores/{id}/search` | Semantic search |
| `POST` | `/v1/vector_stores/{id}/rag` | RAG query |

### Examples

#### Create a Store & Upload a File

```python
import httpx

client = httpx.Client(
    base_url="https://api.your-domain.com/v1",
    headers={"Authorization": "Bearer sk-your-key"},
    timeout=120.0,
)

# Create store
store = client.post(
    "/vector_stores",
    json={"name": "My Knowledge Base"}
).json()

# Upload document
with open("document.pdf", "rb") as f:
    client.post(
        f"/vector_stores/{store['id']}/upload",
        files={"file": f}
    )

# Upload image (with the default vision model)
with open("screenshot.png", "rb") as f:
    client.post(
        f"/vector_stores/{store['id']}/upload",
        files={"file": f}
    )

# Upload image (with a custom vision model)
with open("diagram.png", "rb") as f:
    client.post(
        f"/vector_stores/{store['id']}/upload",
        files={"file": f},
        data={
            "vision_model": "openai/gpt-4o",
            "vision_prompt": "Explain this diagram in detail."
        }
    )

# Search
results = client.post(
    f"/vector_stores/{store['id']}/search",
    json={
        "query": "What is FastAPI?",
        "top_k": 3,
        "rerank": True
    }
).json()

# RAG
answer = client.post(
    f"/vector_stores/{store['id']}/rag",
    json={
        "query": "What is FastAPI?",
        "model": "openai/gpt-4o-mini",
        "rerank": True
    }
).json()
print(answer["answer"])
```

#### JavaScript / TypeScript

```javascript
const API_KEY = "sk-your-api-key";
const BASE_URL = "https://api.your-domain.com/v1";
const HEADERS = {
  "Authorization": `Bearer ${API_KEY}`,
  "Content-Type": "application/json"
};

// Create store
const store = await fetch(`${BASE_URL}/vector_stores`, {
  method: "POST",
  headers: HEADERS,
  body: JSON.stringify({ name: "My Store" })
}).then(r => r.json());

// Search
const results = await fetch(
  `${BASE_URL}/vector_stores/${store.id}/search`, {
    method: "POST",
    headers: HEADERS,
    body: JSON.stringify({
      query: "What is FastAPI?",
      top_k: 3,
      rerank: true
    })
  }
).then(r => r.json());

// RAG
const answer = await fetch(
  `${BASE_URL}/vector_stores/${store.id}/rag`, {
    method: "POST",
    headers: HEADERS,
    body: JSON.stringify({
      query: "What is FastAPI?"
    })
  }
).then(r => r.json());

console.log(answer.answer);
```

#### curl

```bash
# Create store
curl -X POST https://api.your-domain.com/v1/vector_stores \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{"name": "My Store"}'

# Upload document
curl -X POST https://api.your-domain.com/v1/vector_stores/{store_id}/upload \
  -H "Authorization: Bearer sk-your-key" \
  -F "file=@document.pdf"

# Upload image with a custom vision model
curl -X POST https://api.your-domain.com/v1/vector_stores/{store_id}/upload \
  -H "Authorization: Bearer sk-your-key" \
  -F "file=@diagram.png" \
  -F "vision_model=openai/gpt-4o" \
  -F "vision_prompt=Explain this diagram in detail."

# Search
curl -X POST https://api.your-domain.com/v1/vector_stores/{store_id}/search \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is FastAPI?", "top_k": 3, "rerank": true}'

# RAG
curl -X POST https://api.your-domain.com/v1/vector_stores/{store_id}/rag \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is FastAPI?", "model": "openai/gpt-4o-mini"}'
```

## Configuration Reference

### Environment Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `DATABASE_URL` | ✅ | — | PostgreSQL connection URL |
| `LITELLM_PROXY_URL` | ✅ | — | LiteLLM proxy URL |
| `LITELLM_MASTER_KEY` | ✅ | — | LiteLLM master key |
| `ADMIN_USER_IDS` | ✅ | — | Comma-separated admin user IDs |
| `EMBEDDING_MODEL` | ❌ | `text-embedding-ada-002` | Default embedding model |
| `VISION_MODEL` | ❌ | `openai/gpt-4o-mini` | Default vision model |

### Upload Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `file` | file | — | File to upload |
| `chunk_size` | int | 512 | Characters per chunk |
| `chunk_overlap` | int | 50 | Overlapping characters between consecutive chunks |
| `vision_model` | string | Config default | Vision model for images |
| `vision_prompt` | string | Auto | Custom prompt for the vision model |
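To illustrate how `chunk_size` and `chunk_overlap` interact: each chunk starts `chunk_size - chunk_overlap` characters after the previous one, so consecutive chunks share `chunk_overlap` characters. The actual implementation lives in `app/utils/chunking.py`; the sketch below is an illustrative stand-in, not the real code.

```python
def chunk_text(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows with overlap (illustrative)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    # Each window starts `step` characters after the previous one, so
    # consecutive windows share exactly `chunk_overlap` characters.
    return [text[i:i + chunk_size] for i in range(0, len(text), step)
            if text[i:i + chunk_size]]

# With the defaults, a 1000-character document yields three chunks:
# [0:512], [462:974], and [924:1000].
print(len(chunk_text("a" * 1000)))  # 3
```

A larger `chunk_overlap` reduces the chance that a sentence is cut mid-thought at a chunk boundary, at the cost of storing and embedding more redundant text.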
### Search Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `query` | string | — | Search query |
| `top_k` | int | 5 | Number of results (max. 50) |
| `rerank` | bool | false | Enable reranking |
| `rerank_model` | string | Auto | Custom rerank model |
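Results are ranked by cosine distance, the metric behind the `vector_cosine_ops` operator class used by the `ivfflat` index in the database setup. A minimal pure-Python illustration of that metric:

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance as defined by pgvector's vector_cosine_ops: 1 - cos(a, b)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Vectors pointing the same way have distance 0; orthogonal vectors have distance 1.
print(cosine_distance([1.0, 0.0], [2.0, 0.0]))  # 0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0
```

Because the metric depends only on direction, not magnitude, documents are matched on semantic similarity of their embeddings rather than on vector length.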
### RAG Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `query` | string | — | Question |
| `model` | string | `cosair/gemma4:31b` | LLM model |
| `top_k` | int | 5 | Number of context documents |
| `rerank` | bool | false | Enable reranking |
| `system_prompt` | string | Auto | Custom system prompt |
| `messages` | array | `[]` | Chat history |
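The `messages` array lets a follow-up question reuse earlier conversation turns. The sketch below assumes OpenAI-style chat messages (plausible given the OpenAI-compatible API, but check the endpoint schema) and only builds the request body; sending it works like the RAG calls in the examples above.

```python
import json

# Prior turns, assumed here to use the OpenAI chat-message shape.
history = [
    {"role": "user", "content": "What is FastAPI?"},
    {"role": "assistant", "content": "FastAPI is a Python web framework."},
]

payload = {
    "query": "Does it support async handlers?",  # follow-up question
    "model": "openai/gpt-4o-mini",
    "top_k": 5,
    "rerank": True,
    "messages": history,  # lets the LLM resolve "it" from context
}
print(json.dumps(payload, indent=2))
```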
### Supported File Formats

| Format | Extension | Notes |
|--------|-----------|-------|
| Text | `.txt` | UTF-8 encoded |
| Markdown | `.md` | Standard Markdown |
| PDF | `.pdf` | Text-based PDFs only, no scans |
| Word | `.docx` | Microsoft Word 2007+ |
| Excel | `.xlsx` | All sheets extracted |
| CSV | `.csv` | All columns extracted |
| PowerPoint | `.pptx` | All slides extracted |
| HTML | `.html`, `.htm` | Scripts/styles removed |
| Outlook Mail | `.msg` | Including headers |
| E-Mail | `.eml` | Including headers |
| JSON | `.json` | Pretty-printed |
| Image | `.jpg`, `.jpeg`, `.png`, `.gif`, `.webp`, `.tiff` | Described via Vision LLM |
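The upload endpoint routes a file by its extension: image formats go through the Vision LLM, everything else through text extraction. A sketch of that dispatch, purely illustrative (the real logic lives in the backend):

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp", ".tiff"}
TEXT_EXTENSIONS = {
    ".txt", ".md", ".pdf", ".docx", ".xlsx", ".csv",
    ".pptx", ".html", ".htm", ".msg", ".eml", ".json",
}

def pipeline_for(filename: str) -> str:
    """Route an upload: images to the Vision LLM, documents to text extraction."""
    ext = Path(filename).suffix.lower()  # case-insensitive match
    if ext in IMAGE_EXTENSIONS:
        return "vision"
    if ext in TEXT_EXTENSIONS:
        return "text"
    raise ValueError(f"Unsupported file format: {ext}")

print(pipeline_for("diagram.PNG"))  # vision
print(pipeline_for("report.pdf"))   # text
```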
### Limits

| Limit | Value |
|-------|-------|
| Max file size | 256 MB |
| Max search results | 50 |
| Request timeout | 600 seconds |
| Default chunk size | 512 characters |
| Default chunk overlap | 50 characters |
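A client can mirror the file-size limit locally and fail fast instead of streaming 256 MB only to have the server reject it. A small sketch (the constant mirrors the table above; the exact server-side behavior on oversized uploads is not specified here):

```python
import os

MAX_FILE_SIZE = 256 * 1024 * 1024  # mirrors the 256 MB server-side limit

def check_upload_size(path: str) -> None:
    """Raise before uploading if the file exceeds the documented size limit."""
    size = os.path.getsize(path)
    if size > MAX_FILE_SIZE:
        raise ValueError(f"{path} is {size} bytes, above the {MAX_FILE_SIZE}-byte limit")
```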
## Admin UI

The Admin UI is available at `https://admin.your-domain.com`.

Log in with your admin API key to:

- 📊 View usage statistics
- 👥 Manage users and their stores
- 🔑 Rotate API keys
- 🔒 Grant/revoke store permissions

## Development

```bash
# Install dependencies
pip install -r requirements.txt

# Run the API locally
DATABASE_URL="postgresql://..." \
LITELLM_PROXY_URL="http://..." \
LITELLM_MASTER_KEY="sk-..." \
ADMIN_USER_IDS="your-id" \
EMBEDDING_MODEL="your-model" \
VISION_MODEL="openai/gpt-4o-mini" \
uvicorn app.main:app --reload

# Run the UI locally
cd ui
npm install
VITE_API_URL=http://localhost:8000 npm run dev
```

## Tech Stack

| Component | Technology |
|-----------|-----------|
| **API** | FastAPI + Python 3.12 |
| **Database** | PostgreSQL 16 + pgvector |
| **Auth** | LiteLLM key management |
| **Embeddings** | Via LiteLLM Proxy |
| **Vision** | Via LiteLLM vision models |
| **Admin UI** | React + TypeScript + Tailwind CSS |
| **Containers** | Docker + Kubernetes |
| **Ingress** | NGINX Ingress Controller |
| **TLS** | cert-manager + Let's Encrypt |

## License

MIT License - see [LICENSE](LICENSE) for details.

## Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/my-feature`)
3. Commit your changes (`git commit -m 'Add my feature'`)
4. Push to the branch (`git push origin feature/my-feature`)
5. Open a Pull Request