litellm-vector-store/README.md

# litellm-vector-store

A vector store service built on top of [LiteLLM](https://github.com/BerriAI/litellm) and [pgvector](https://github.com/pgvector/pgvector), providing an OpenAI-compatible API for semantic search, document storage and Retrieval Augmented Generation (RAG).

## Features

- 🔐 **Authentication** via LiteLLM API Keys
- 🗄️ **Vector Store** powered by PostgreSQL + pgvector
- 🔍 **Semantic Search** with optional Reranking
- 🤖 **RAG Endpoint** - Search + LLM in one request
- 📄 **File Upload** - PDF, DOCX, TXT, Markdown, Excel, CSV, PowerPoint, HTML, E-Mail, JSON
- 🖼️ **Image Support** - Upload images via Vision LLM (JPG, PNG, GIF, WebP, TIFF)
- 🧩 **OpenAI-compatible API** - works with existing OpenAI SDKs
- 👥 **Multi-User** - Store permissions per user
- 🖥️ **Admin UI** - Manage users, stores and permissions
- 📊 **Usage Tracking** - Track requests per user

## Architecture

```
Client (API Key)
      │
      ▼
LiteLLM Proxy ──────────────────────────────┐
      │                                     │
      ▼                                     ▼
Vector Store API                    LiteLLM Models
      │                          ┌──────────────────┐
      ▼                          │ Embedding Models  │
PostgreSQL + pgvector             │ Vision Models     │
                                  │ LLM Models        │
                                  └──────────────────┘
```

## Requirements

- Kubernetes Cluster
- PostgreSQL with pgvector extension (already deployed)
- LiteLLM Proxy (already deployed)
- Container Registry

## Quick Start

### 1. Clone Repository

```bash
git clone https://github.com/your-org/litellm-vector-store.git
cd litellm-vector-store
```

### 2. Database Setup

```bash
kubectl exec -it <postgres-pod> -n <namespace> \
  -- psql -U postgres -d vectordb << 'EOF'

CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

CREATE TABLE IF NOT EXISTS vector_stores (
    id            UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name          VARCHAR(255) NOT NULL,
    owner_user_id VARCHAR(255) NOT NULL,
    created_at    TIMESTAMP DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS documents (
    id         UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    store_id   UUID REFERENCES vector_stores(id) ON DELETE CASCADE,
    content    TEXT NOT NULL,
    metadata   JSONB DEFAULT '{}',
    embedding  vector(1024),
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS store_permissions (
    store_id   UUID REFERENCES vector_stores(id) ON DELETE CASCADE,
    user_id    VARCHAR(255) NOT NULL,
    permission VARCHAR(50) DEFAULT 'read',
    PRIMARY KEY (store_id, user_id)
);

CREATE TABLE IF NOT EXISTS usage_stats (
    id         UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id    VARCHAR(255) NOT NULL,
    store_id   UUID REFERENCES vector_stores(id) ON DELETE SET NULL,
    action     VARCHAR(50) NOT NULL,
    tokens     INT DEFAULT 0,
    duration   FLOAT DEFAULT 0,
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX IF NOT EXISTS idx_documents_store
    ON documents(store_id);
CREATE INDEX IF NOT EXISTS idx_documents_embedding
    ON documents USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);
CREATE INDEX IF NOT EXISTS idx_usage_user
    ON usage_stats(user_id);
CREATE INDEX IF NOT EXISTS idx_usage_created
    ON usage_stats(created_at);

GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO vecuser;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO vecuser;

EOF
```

### 3. Configure

```bash
# Create secrets
kubectl create secret generic vector-api-secrets \
  --namespace vector-store \
  --from-literal=DATABASE_URL="postgresql://vecuser:pass@postgres:5432/vectordb" \
  --from-literal=LITELLM_MASTER_KEY="sk-master-key"
```

```yaml
# k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: vector-store-config
  namespace: vector-store
data:
  LITELLM_PROXY_URL: "http://litellm.<namespace>.svc.cluster.local:4000"
  ADMIN_USER_IDS:    "your-admin-user-id"
  API_URL:           "https://api.your-domain.com"
  EMBEDDING_MODEL:   "your-embedding-model"
  VISION_MODEL:      "openai/gpt-4o-mini"
```

### 4. Build & Deploy

```bash
# Build & push API
docker build -t your-registry/vector-store-api:1.0.0 .
docker push your-registry/vector-store-api:1.0.0

# Build & push Admin UI
docker build \
  -t your-registry/vector-store-admin:1.0.0 \
  ./ui
docker push your-registry/vector-store-admin:1.0.0

# Deploy
kubectl apply -f k8s/namespace.yaml
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/secrets.yaml
kubectl apply -f k8s/vector-api/
kubectl apply -f k8s/admin-ui/
kubectl apply -f k8s/ingress-api.yaml
kubectl apply -f k8s/ingress-ui.yaml
```

## Project Structure

```
litellm-vector-store/
├── app/                          # FastAPI Backend
│   ├── main.py                   # Application entry point
│   ├── auth.py                   # LiteLLM authentication
│   ├── database.py               # PostgreSQL connection
│   ├── models.py                 # Pydantic models
│   ├── routers/
│   │   ├── stores.py             # Vector store CRUD
│   │   ├── documents.py          # Document management
│   │   ├── admin.py              # Admin endpoints
│   │   └── openai_compat.py      # OpenAI-compatible API
│   └── utils/
│       ├── chunking.py           # Text chunking
│       ├── image_processor.py    # Vision LLM integration
│       └── stats.py              # Usage tracking
├── ui/                           # React Admin UI
│   ├── src/
│   │   ├── pages/
│   │   │   ├── Login.tsx
│   │   │   ├── Dashboard.tsx
│   │   │   ├── Users.tsx
│   │   │   └── Stores.tsx
│   │   ├── components/
│   │   │   ├── Layout.tsx
│   │   │   └── PermissionModal.tsx
│   │   └── api/
│   │       └── client.ts
│   └── Dockerfile
├── k8s/                          # Kubernetes manifests
│   ├── namespace.yaml
│   ├── configmap.yaml
│   ├── secrets.yaml
│   ├── vector-api/
│   │   ├── deployment.yaml
│   │   └── service.yaml
│   ├── admin-ui/
│   │   ├── deployment.yaml
│   │   └── service.yaml
│   ├── ingress-api.yaml
│   └── ingress-ui.yaml
├── scripts/
│   └── init.sql                  # Database initialization
├── Dockerfile
├── requirements.txt
└── README.md
```

## API Reference

### Base URL

```
https://api.your-domain.com/v1
```

### Authentication

```
Authorization: Bearer sk-your-api-key
```

### Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/v1/models` | List all models |
| `GET` | `/v1/embeddings/models` | List embedding models |
| `GET` | `/v1/vision/models` | List vision models |
| `POST` | `/v1/embeddings` | Create embeddings |
| `POST` | `/v1/vector_stores` | Create store |
| `GET` | `/v1/vector_stores` | List stores |
| `GET` | `/v1/vector_stores/{id}` | Get store |
| `DELETE` | `/v1/vector_stores/{id}` | Delete store |
| `POST` | `/v1/vector_stores/{id}/files` | Add texts |
| `GET` | `/v1/vector_stores/{id}/files` | List files |
| `DELETE` | `/v1/vector_stores/{id}/files/{file_id}` | Delete file |
| `POST` | `/v1/vector_stores/{id}/upload` | Upload file or image |
| `POST` | `/v1/vector_stores/{id}/search` | Semantic search |
| `POST` | `/v1/vector_stores/{id}/rag` | RAG query |

### Examples

#### Store anlegen & Datei hochladen

```python
import httpx

client = httpx.Client(
    base_url="https://api.your-domain.com/v1",
    headers={"Authorization": "Bearer sk-your-key"},
    timeout=120.0
)

# Create store
store = client.post(
    "/vector_stores",
    json={"name": "My Knowledge Base"}
).json()

# Upload document
with open("document.pdf", "rb") as f:
    client.post(
        f"/vector_stores/{store['id']}/upload",
        files={"file": f}
    )

# Upload image (with default vision model)
with open("screenshot.png", "rb") as f:
    client.post(
        f"/vector_stores/{store['id']}/upload",
        files={"file": f}
    )

# Upload image (with custom vision model)
with open("diagram.png", "rb") as f:
    client.post(
        f"/vector_stores/{store['id']}/upload",
        files={"file": f},
        data={
            "vision_model":  "openai/gpt-4o",
            "vision_prompt": "Explain this diagram in detail."
        }
    )

# Search
results = client.post(
    f"/vector_stores/{store['id']}/search",
    json={
        "query":  "What is FastAPI?",
        "top_k":  3,
        "rerank": True
    }
).json()

# RAG
answer = client.post(
    f"/vector_stores/{store['id']}/rag",
    json={
        "query":  "What is FastAPI?",
        "model":  "openai/gpt-4o-mini",
        "rerank": True
    }
).json()
print(answer["answer"])
```

#### JavaScript / TypeScript

```javascript
const API_KEY  = "sk-your-api-key";
const BASE_URL = "https://api.your-domain.com/v1";
const HEADERS  = {
    "Authorization": `Bearer ${API_KEY}`,
    "Content-Type":  "application/json"
};

// Create store
const store = await fetch(`${BASE_URL}/vector_stores`, {
    method:  "POST",
    headers: HEADERS,
    body:    JSON.stringify({ name: "My Store" })
}).then(r => r.json());

// Search
const results = await fetch(
    `${BASE_URL}/vector_stores/${store.id}/search`, {
    method:  "POST",
    headers: HEADERS,
    body:    JSON.stringify({
        query:  "What is FastAPI?",
        top_k:  3,
        rerank: true
    })
}).then(r => r.json());

// RAG
const answer = await fetch(
    `${BASE_URL}/vector_stores/${store.id}/rag`, {
    method:  "POST",
    headers: HEADERS,
    body:    JSON.stringify({
        query: "What is FastAPI?"
    })
}).then(r => r.json());

console.log(answer.answer);
```

#### curl

```bash
# Create store
curl -X POST https://api.your-domain.com/v1/vector_stores \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{"name": "My Store"}'

# Upload document
curl -X POST https://api.your-domain.com/v1/vector_stores/{store_id}/upload \
  -H "Authorization: Bearer sk-your-key" \
  -F "file=@document.pdf"

# Upload image with custom vision model
curl -X POST https://api.your-domain.com/v1/vector_stores/{store_id}/upload \
  -H "Authorization: Bearer sk-your-key" \
  -F "file=@diagram.png" \
  -F "vision_model=openai/gpt-4o" \
  -F "vision_prompt=Explain this diagram in detail."

# Search
curl -X POST https://api.your-domain.com/v1/vector_stores/{store_id}/search \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is FastAPI?", "top_k": 3, "rerank": true}'

# RAG
curl -X POST https://api.your-domain.com/v1/vector_stores/{store_id}/rag \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is FastAPI?", "model": "openai/gpt-4o-mini"}'
```

## Configuration Reference

### Environment Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `DATABASE_URL` | ✅ | — | PostgreSQL connection URL |
| `LITELLM_PROXY_URL` | ✅ | — | LiteLLM proxy URL |
| `LITELLM_MASTER_KEY` | ✅ | — | LiteLLM master key |
| `ADMIN_USER_IDS` | ✅ | — | Comma-separated admin user IDs |
| `EMBEDDING_MODEL` | ❌ | `text-embedding-ada-002` | Default embedding model |
| `VISION_MODEL` | ❌ | `openai/gpt-4o-mini` | Default vision model |

### Upload Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `file` | file | — | File to upload |
| `chunk_size` | int | 512 | Characters per chunk |
| `chunk_overlap` | int | 50 | Overlap between chunks |
| `vision_model` | string | Config default | Vision model for images |
| `vision_prompt` | string | Auto | Custom prompt for vision model |

### Search Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `query` | string | — | Search query |
| `top_k` | int | 5 | Number of results (max. 50) |
| `rerank` | bool | false | Enable reranking |
| `rerank_model` | string | Auto | Custom rerank model |

### RAG Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `query` | string | — | Question |
| `model` | string | cosair/gemma4:31b | LLM model |
| `top_k` | int | 5 | Context documents |
| `rerank` | bool | false | Enable reranking |
| `system_prompt` | string | Auto | Custom system prompt |
| `messages` | array | [] | Chat history |

### Supported File Formats

| Format | Extension | Notes |
|--------|-----------|-------|
| Text | `.txt` | UTF-8 encoded |
| Markdown | `.md` | Standard Markdown |
| PDF | `.pdf` | Text PDFs only, no scans |
| Word | `.docx` | Microsoft Word 2007+ |
| Excel | `.xlsx` | All sheets extracted |
| CSV | `.csv` | All columns extracted |
| PowerPoint | `.pptx` | All slides extracted |
| HTML | `.html` `.htm` | Scripts/styles removed |
| Outlook Mail | `.msg` | Including headers |
| E-Mail | `.eml` | Including headers |
| JSON | `.json` | Pretty printed |
| Image | `.jpg` `.jpeg` `.png` `.gif` `.webp` `.tiff` | Via Vision LLM |

### Limits

| Limit | Value |
|-------|-------|
| Max file size | 256 MB |
| Max search results | 50 |
| Request timeout | 600 seconds |
| Default chunk size | 512 characters |
| Default chunk overlap | 50 characters |

## Admin UI

The Admin UI is available at `https://admin.your-domain.com`.

Login with your Admin API Key to:

- 📊 View usage statistics
- 👥 Manage users and their stores
- 🔑 Rotate API keys
- 🔒 Grant/revoke store permissions

## Development

```bash
# Install dependencies
pip install -r requirements.txt

# Run locally
DATABASE_URL="postgresql://..." \
LITELLM_PROXY_URL="http://..." \
LITELLM_MASTER_KEY="sk-..." \
ADMIN_USER_IDS="your-id" \
EMBEDDING_MODEL="your-model" \
VISION_MODEL="openai/gpt-4o-mini" \
uvicorn app.main:app --reload

# Run UI locally
cd ui
npm install
VITE_API_URL=http://localhost:8000 npm run dev
```

## Tech Stack

| Component | Technology |
|-----------|-----------|
| **API** | FastAPI + Python 3.12 |
| **Database** | PostgreSQL 16 + pgvector |
| **Auth** | LiteLLM Key Management |
| **Embeddings** | Via LiteLLM Proxy |
| **Vision** | Via LiteLLM Vision Models |
| **Admin UI** | React + TypeScript + Tailwind CSS |
| **Container** | Docker + Kubernetes |
| **Ingress** | NGINX Ingress Controller |
| **TLS** | cert-manager + Let's Encrypt |

## License

MIT License - see [LICENSE](LICENSE) for details.

## Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/my-feature`)
3. Commit your changes (`git commit -m 'Add my feature'`)
4. Push to the branch (`git push origin feature/my-feature`)
5. Open a Pull Request