litellm-vector-store

A vector store service built on LiteLLM and pgvector. It provides an OpenAI-compatible API for semantic search, document storage, and Retrieval-Augmented Generation (RAG).

Features

  • 🔐 Authentication via LiteLLM API Keys
  • 🗄️ Vector Store powered by PostgreSQL + pgvector
  • 🔍 Semantic Search with optional Reranking
  • 🤖 RAG Endpoint - Search + LLM in one request
  • 📄 File Upload - PDF, DOCX, TXT, Markdown
  • 🧩 OpenAI-compatible API - works with existing OpenAI SDKs
  • 👥 Multi-User - Store permissions per user
  • 🖥️ Admin UI - Manage users, stores and permissions
  • 📊 Usage Tracking - Track requests per user

Architecture

Client (API Key)
      │
      ▼
LiteLLM Proxy ──────────────────────┐
      │                             │
      ▼                             ▼
Vector Store API            Embedding Models
      │                         (via LiteLLM)
      ▼
PostgreSQL + pgvector
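
In the diagram above, embeddings come from models served behind the LiteLLM proxy. LiteLLM exposes an OpenAI-compatible `/v1/embeddings` route, so the call from the Vector Store API could look roughly like the sketch below. The function name `build_embedding_request` is hypothetical, and whether the service forwards the client's key or uses the master key is an assumption not confirmed by this README. The request is only built here, not sent.

```python
import json
import urllib.request
from typing import Sequence


def build_embedding_request(
    proxy_url: str, api_key: str, model: str, texts: Sequence[str]
) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style embeddings request.

    Hypothetical helper: the payload shape follows the OpenAI embeddings
    API, which the LiteLLM proxy accepts on /v1/embeddings.
    """
    body = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=proxy_url.rstrip("/") + "/v1/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would perform the call; the embedding vectors are returned under `data[i].embedding` in the OpenAI response format.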

Requirements

  • Kubernetes Cluster
  • PostgreSQL with pgvector extension
  • LiteLLM Proxy (deployed)
  • Container Registry

Quick Start

1. Clone Repository

git clone https://github.com/your-org/litellm-vector-store.git
cd litellm-vector-store

2. Database Setup

# Use -i without -t: the SQL is piped in through the heredoc, so stdin is not a TTY
kubectl exec -i <postgres-pod> -n <namespace> \
  -- psql -U postgres -d vectordb << 'EOF'

CREATE EXTENSION IF NOT EXISTS vector;
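-- gen_random_uuid() is built in since PostgreSQL 13; uuid-ossp is only
-- needed if other code calls uuid_generate_v4().
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";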

CREATE TABLE IF NOT EXISTS vector_stores (
    id            UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name          VARCHAR(255) NOT NULL,
    owner_user_id VARCHAR(255) NOT NULL,
    created_at    TIMESTAMP DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS documents (
    id         UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    store_id   UUID REFERENCES vector_stores(id) ON DELETE CASCADE,
    content    TEXT NOT NULL,
    metadata   JSONB DEFAULT '{}',
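    -- The dimension must equal the output size of EMBEDDING_MODEL
    -- (for example, text-embedding-ada-002 returns 1536 dimensions).
    embedding  vector(1024),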
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS store_permissions (
    store_id   UUID REFERENCES vector_stores(id) ON DELETE CASCADE,
    user_id    VARCHAR(255) NOT NULL,
    permission VARCHAR(50) DEFAULT 'read',
    PRIMARY KEY (store_id, user_id)
);

CREATE TABLE IF NOT EXISTS usage_stats (
    id         UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id    VARCHAR(255) NOT NULL,
    store_id   UUID REFERENCES vector_stores(id) ON DELETE SET NULL,
    action     VARCHAR(50) NOT NULL,
    tokens     INT DEFAULT 0,
    duration   FLOAT DEFAULT 0,
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX IF NOT EXISTS idx_documents_store
    ON documents(store_id);
CREATE INDEX IF NOT EXISTS idx_documents_embedding
    ON documents USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);
CREATE INDEX IF NOT EXISTS idx_usage_user
    ON usage_stats(user_id);
CREATE INDEX IF NOT EXISTS idx_usage_created
    ON usage_stats(created_at);

GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO vecuser;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO vecuser;

EOF
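
With this schema, a semantic search reduces to one query ordered by cosine distance, which is what the `vector_cosine_ops` index supports. The sketch below is a minimal illustration, assuming the `documents` table as defined above. The helper names `to_vector_literal` and `build_search_query` are hypothetical and do not come from this repository. pgvector's `<=>` operator returns cosine distance, so similarity is computed as `1 - distance`.

```python
from typing import Sequence


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format floats as a pgvector text literal such as '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_search_query(top_k: int = 3) -> str:
    """Return a parameterized similarity query (psycopg-style %s placeholders).

    Parameters, in order: query vector literal, store id, query vector literal.
    The upper bound of 50 mirrors the "Max search results" limit in this README.
    """
    if not 1 <= top_k <= 50:
        raise ValueError("top_k must be between 1 and 50")
    return (
        "SELECT id, content, metadata, "
        "1 - (embedding <=> %s::vector) AS similarity "
        "FROM documents "
        "WHERE store_id = %s "
        "ORDER BY embedding <=> %s::vector "
        f"LIMIT {int(top_k)}"
    )
```

A caller would execute the query with `(to_vector_literal(vec), store_id, to_vector_literal(vec))` as parameters, where `vec` has the same dimension as the `embedding` column.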

3. Configure

# The vector-store namespace must already exist (see k8s/namespace.yaml)
kubectl create secret generic vector-api-secrets \
  --namespace vector-store \
  --from-literal=DATABASE_URL="postgresql://vecuser:pass@postgres:5432/vectordb" \
  --from-literal=LITELLM_MASTER_KEY="sk-master-key"

# k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: vector-store-config
  namespace: vector-store
data:
  LITELLM_PROXY_URL: "http://litellm.<namespace>.svc.cluster.local:4000"
  ADMIN_USER_IDS:    "your-admin-user-id"
  API_URL:           "https://api.your-domain.com"
  EMBEDDING_MODEL:   "your-embedding-model"

4. Build & Deploy

# API
docker build -t your-registry/vector-store-api:1.0.0 .
docker push your-registry/vector-store-api:1.0.0

# Admin UI
docker build \
  -t your-registry/vector-store-admin:1.0.0 \
  ./ui
docker push your-registry/vector-store-admin:1.0.0

# Deploy
kubectl apply -f k8s/namespace.yaml
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/secrets.yaml
kubectl apply -f k8s/vector-api/
kubectl apply -f k8s/admin-ui/
kubectl apply -f k8s/ingress-api.yaml
kubectl apply -f k8s/ingress-ui.yaml

Project Structure

litellm-vector-store/
├── app/                          # FastAPI Backend
│   ├── main.py                   # Application entry point
│   ├── auth.py                   # LiteLLM authentication
│   ├── database.py               # PostgreSQL connection
│   ├── models.py                 # Pydantic models
│   ├── routers/
│   │   ├── stores.py             # Vector store CRUD
│   │   ├── documents.py          # Document management
│   │   ├── admin.py              # Admin endpoints
│   │   └── openai_compat.py      # OpenAI-compatible API
│   └── utils/
│       ├── chunking.py           # Text chunking
│       └── stats.py              # Usage tracking
├── ui/                           # React Admin UI
│   ├── src/
│   │   ├── pages/
│   │   │   ├── Login.tsx
│   │   │   ├── Dashboard.tsx
│   │   │   ├── Users.tsx
│   │   │   └── Stores.tsx
│   │   ├── components/
│   │   │   ├── Layout.tsx
│   │   │   └── PermissionModal.tsx
│   │   └── api/
│   │       └── client.ts
│   └── Dockerfile
├── k8s/                          # Kubernetes manifests
│   ├── namespace.yaml
│   ├── configmap.yaml
│   ├── secrets.yaml
│   ├── vector-api/
│   │   ├── deployment.yaml
│   │   └── service.yaml
│   ├── admin-ui/
│   │   ├── deployment.yaml
│   │   └── service.yaml
│   ├── ingress-api.yaml
│   └── ingress-ui.yaml
├── scripts/
│   └── init.sql                  # Database initialization
├── Dockerfile
├── requirements.txt
└── README.md

API Reference

Base URL

https://api.your-domain.com/v1

Authentication

Authorization: Bearer sk-your-api-key
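
Every request carries the LiteLLM API key in this header. How the service parses it is not shown in this README; the following is a minimal sketch of the usual approach, with `extract_bearer_token` as a hypothetical helper name. Validating the extracted key against the LiteLLM proxy would be a separate step.

```python
from typing import Optional


def extract_bearer_token(authorization: Optional[str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header value.

    Returns None when the header is missing, uses another scheme,
    or carries an empty token.
    """
    if not authorization:
        return None
    scheme, _, token = authorization.strip().partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token
```

A request handler would typically respond with HTTP 401 when this returns `None`.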

Endpoints

Method  Endpoint                                Description
POST    /v1/vector_stores                       Create store
GET     /v1/vector_stores                       List stores
GET     /v1/vector_stores/{id}                  Get store
DELETE  /v1/vector_stores/{id}                  Delete store
POST    /v1/vector_stores/{id}/files            Add texts
GET     /v1/vector_stores/{id}/files            List files
DELETE  /v1/vector_stores/{id}/files/{file_id}  Delete file
POST    /v1/vector_stores/{id}/upload           Upload file
POST    /v1/vector_stores/{id}/search           Search
POST    /v1/vector_stores/{id}/rag              RAG query
POST    /v1/embeddings                          Create embeddings
GET     /v1/embeddings/models                   List embedding models
GET     /v1/models                              List all models

Example

import httpx

client = httpx.Client(
    base_url="https://api.your-domain.com/v1",
    headers={"Authorization": "Bearer sk-your-key"}
)

# Create store
store = client.post(
    "/vector_stores",
    json={"name": "My Knowledge Base"}
).json()

# Upload file
with open("document.pdf", "rb") as f:
    client.post(
        f"/vector_stores/{store['id']}/upload",
        files={"file": f}
    )

# Search
results = client.post(
    f"/vector_stores/{store['id']}/search",
    json={"query": "What is FastAPI?", "top_k": 3}
).json()

# RAG
answer = client.post(
    f"/vector_stores/{store['id']}/rag",
    json={"query": "What is FastAPI?"}
).json()
print(answer["answer"])
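
The `/rag` call above combines retrieval and generation in one request. How the server assembles its prompt is not documented here, so the following is only an illustrative sketch of the general pattern: number the retrieved chunks, cap the context size, and append the question. The function name `build_rag_prompt`, the `max_chars` budget, and the prompt wording are all assumptions.

```python
def build_rag_prompt(question: str, chunks: list[str], max_chars: int = 4000) -> str:
    """Assemble an LLM prompt from retrieved chunks (hypothetical helper).

    Chunks are added in ranking order until the character budget is
    reached; the first chunk that does not fit ends the context.
    """
    context_parts = []
    used = 0
    for i, chunk in enumerate(chunks, start=1):
        entry = f"[{i}] {chunk.strip()}"
        if used + len(entry) > max_chars:
            break
        context_parts.append(entry)
        used += len(entry)
    context = "\n\n".join(context_parts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Numbering the chunks makes it possible to ask the model to cite its sources by index.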

Configuration Reference

Environment Variables

Variable            Required  Default                 Description
DATABASE_URL                                          PostgreSQL connection URL
LITELLM_PROXY_URL                                     LiteLLM proxy URL
LITELLM_MASTER_KEY                                    LiteLLM master key
ADMIN_USER_IDS                                        Comma-separated admin user IDs
EMBEDDING_MODEL               text-embedding-ada-002  Default embedding model
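
A minimal sketch of how these variables might be read at startup. The `Settings` class and `load_settings` function are hypothetical names, and treating the database URL, proxy URL, and master key as mandatory is an assumption, since this README does not state which variables are required.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    litellm_proxy_url: str
    litellm_master_key: str
    admin_user_ids: tuple[str, ...]
    embedding_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from environment variables (hypothetical helper)."""
    # Assumption: these three must be set; the README does not confirm it.
    mandatory = ("DATABASE_URL", "LITELLM_PROXY_URL", "LITELLM_MASTER_KEY")
    missing = [name for name in mandatory if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # ADMIN_USER_IDS is a comma-separated list; ignore empty entries.
    admins = tuple(
        part.strip() for part in env.get("ADMIN_USER_IDS", "").split(",") if part.strip()
    )
    return Settings(
        database_url=env["DATABASE_URL"],
        litellm_proxy_url=env["LITELLM_PROXY_URL"],
        litellm_master_key=env["LITELLM_MASTER_KEY"],
        admin_user_ids=admins,
        embedding_model=env.get("EMBEDDING_MODEL", "text-embedding-ada-002"),
    )
```

Failing fast on missing values surfaces configuration mistakes at pod startup rather than on the first request.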

Supported File Formats

Format    Extension  Notes
Text      .txt       UTF-8 encoded
PDF       .pdf       Text PDFs only, no scans
Word      .docx      Microsoft Word 2007+
Markdown  .md        Standard Markdown
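
A client can check a file before uploading it, which avoids sending a large request that the server would reject. The sketch below assumes the extensions listed above and the maximum file size given under Limits. The name `check_upload` is hypothetical, and reading "256 MB" as 256 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".txt", ".pdf", ".docx", ".md"}
# Assumption: "256 MB" is interpreted as mebibytes.
MAX_FILE_SIZE_BYTES = 256 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file type or size is outside the documented limits."""
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        raise ValueError("File exceeds the 256 MB upload limit")
```

For a local file, `Path(path).stat().st_size` provides the size to pass in. The check cannot detect scanned PDFs, which the service does not support.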

Limits

Limit                  Value
Max file size          256 MB
Max search results     50
Request timeout        600 seconds
Default chunk size     512 characters
Default chunk overlap  50 characters
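
The chunk size and overlap determine how a document is split before embedding. The sketch below shows a plain fixed-size sliding window with the default values from the table. It is an illustration only: the actual logic in `app/utils/chunking.py` is not shown in this README and may split on sentence or paragraph boundaries instead.

```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks
```

With the defaults, each window advances by 462 characters, so the last 50 characters of one chunk reappear at the start of the next. The overlap keeps sentences that straddle a boundary retrievable from at least one chunk.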

Admin UI

The Admin UI is available at https://admin.your-domain.com.

Log in with your admin API key to:

  • 📊 View usage statistics
  • 👥 Manage users and their stores
  • 🔑 Rotate API keys
  • 🔒 Grant/revoke store permissions
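
Store permissions combine three sources from the setup above: the `owner_user_id` column, the `store_permissions` table, and `ADMIN_USER_IDS`. The sketch below shows one way such an access check could be expressed. The function `can_access` is hypothetical, and the `write` level is an assumption made for illustration, since the schema only shows `read` as the default value.

```python
from typing import Collection, Mapping

# Hypothetical ordering: only 'read' appears in the schema; 'write' is assumed.
PERMISSION_RANK = {"read": 1, "write": 2}


def can_access(
    user_id: str,
    owner_user_id: str,
    grants: Mapping[str, str],
    admin_user_ids: Collection[str] = (),
    required: str = "read",
) -> bool:
    """Decide whether a user may act on a store.

    `grants` maps user_id to the permission stored in store_permissions.
    Admins and the store owner always have access.
    """
    if user_id in admin_user_ids or user_id == owner_user_id:
        return True
    granted = grants.get(user_id)
    if granted is None:
        return False
    # Unknown permission strings rank as 0 and therefore never match.
    return PERMISSION_RANK.get(granted, 0) >= PERMISSION_RANK[required]
```

Revoking access in this model corresponds to deleting the user's row from `store_permissions`.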

Development

# Install dependencies
pip install -r requirements.txt

# Run locally
DATABASE_URL="postgresql://..." \
LITELLM_PROXY_URL="http://..." \
LITELLM_MASTER_KEY="sk-..." \
ADMIN_USER_IDS="your-id" \
uvicorn app.main:app --reload

# Run UI locally
cd ui
npm install
VITE_API_URL=http://localhost:8000 npm run dev

Tech Stack

Component   Technology
API         FastAPI + Python 3.12
Database    PostgreSQL 16 + pgvector
Auth        LiteLLM Key Management
Embeddings  Via LiteLLM Proxy
Admin UI    React + TypeScript + Tailwind CSS
Container   Docker + Kubernetes
Ingress     NGINX Ingress Controller
TLS         cert-manager + Let's Encrypt

License

MIT License - see LICENSE for details.

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/my-feature)
  3. Commit your changes (git commit -m 'Add my feature')
  4. Push to the branch (git push origin feature/my-feature)
  5. Open a Pull Request