Files

root ef55253cbd Initial commit

2026-04-29 08:17:35 +00:00

9.8 KiB

Raw Blame History

litellm-vector-store

A vector store service built on top of LiteLLM and pgvector, providing an OpenAI-compatible API for semantic search, document storage and Retrieval Augmented Generation (RAG).

Features

🔐 Authentication via LiteLLM API Keys
🗄️ Vector Store powered by PostgreSQL + pgvector
🔍 Semantic Search with optional Reranking
🤖 RAG Endpoint - Search + LLM in one request
📄 File Upload - PDF, DOCX, TXT, Markdown
🧩 OpenAI-compatible API - works with existing OpenAI SDKs
👥 Multi-User - Store permissions per user
🖥️ Admin UI - Manage users, stores and permissions
📊 Usage Tracking - Track requests per user

Architecture

Client (API Key)
      │
      ▼
LiteLLM Proxy ──────────────────────┐
      │                             │
      ▼                             ▼
Vector Store API            Embedding Models
      │                         (via LiteLLM)
      ▼
PostgreSQL + pgvector

Requirements

Kubernetes Cluster
PostgreSQL with pgvector extension
LiteLLM Proxy (deployed)
Container Registry

Quick Start

1. Clone Repository

git clone https://github.com/your-org/litellm-vector-store.git
cd litellm-vector-store

2. Database Setup

kubectl exec -it <postgres-pod> -n <namespace> \
  -- psql -U postgres -d vectordb << 'EOF'

CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

CREATE TABLE IF NOT EXISTS vector_stores (
    id            UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name          VARCHAR(255) NOT NULL,
    owner_user_id VARCHAR(255) NOT NULL,
    created_at    TIMESTAMP DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS documents (
    id         UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    store_id   UUID REFERENCES vector_stores(id) ON DELETE CASCADE,
    content    TEXT NOT NULL,
    metadata   JSONB DEFAULT '{}',
    embedding  vector(1024),
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS store_permissions (
    store_id   UUID REFERENCES vector_stores(id) ON DELETE CASCADE,
    user_id    VARCHAR(255) NOT NULL,
    permission VARCHAR(50) DEFAULT 'read',
    PRIMARY KEY (store_id, user_id)
);

CREATE TABLE IF NOT EXISTS usage_stats (
    id         UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id    VARCHAR(255) NOT NULL,
    store_id   UUID REFERENCES vector_stores(id) ON DELETE SET NULL,
    action     VARCHAR(50) NOT NULL,
    tokens     INT DEFAULT 0,
    duration   FLOAT DEFAULT 0,
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX IF NOT EXISTS idx_documents_store
    ON documents(store_id);
CREATE INDEX IF NOT EXISTS idx_documents_embedding
    ON documents USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);
CREATE INDEX IF NOT EXISTS idx_usage_user
    ON usage_stats(user_id);
CREATE INDEX IF NOT EXISTS idx_usage_created
    ON usage_stats(created_at);

GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO vecuser;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO vecuser;

EOF

3. Configure

kubectl create secret generic vector-api-secrets \
  --namespace vector-store \
  --from-literal=DATABASE_URL="postgresql://vecuser:pass@postgres:5432/vectordb" \
  --from-literal=LITELLM_MASTER_KEY="sk-master-key"

# k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: vector-store-config
  namespace: vector-store
data:
  LITELLM_PROXY_URL: "http://litellm.<namespace>.svc.cluster.local:4000"
  ADMIN_USER_IDS:    "your-admin-user-id"
  API_URL:           "https://api.your-domain.com"
  EMBEDDING_MODEL:   "your-embedding-model"

4. Build & Deploy

# API
docker build -t your-registry/vector-store-api:1.0.0 .
docker push your-registry/vector-store-api:1.0.0

# Admin UI
docker build \
  -t your-registry/vector-store-admin:1.0.0 \
  ./ui
docker push your-registry/vector-store-admin:1.0.0

# Deploy
kubectl apply -f k8s/namespace.yaml
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/secrets.yaml
kubectl apply -f k8s/vector-api/
kubectl apply -f k8s/admin-ui/
kubectl apply -f k8s/ingress-api.yaml
kubectl apply -f k8s/ingress-ui.yaml

Project Structure

litellm-vector-store/
├── app/                          # FastAPI Backend
│   ├── main.py                   # Application entry point
│   ├── auth.py                   # LiteLLM authentication
│   ├── database.py               # PostgreSQL connection
│   ├── models.py                 # Pydantic models
│   ├── routers/
│   │   ├── stores.py             # Vector store CRUD
│   │   ├── documents.py          # Document management
│   │   ├── admin.py              # Admin endpoints
│   │   └── openai_compat.py      # OpenAI-compatible API
│   └── utils/
│       ├── chunking.py           # Text chunking
│       └── stats.py              # Usage tracking
├── ui/                           # React Admin UI
│   ├── src/
│   │   ├── pages/
│   │   │   ├── Login.tsx
│   │   │   ├── Dashboard.tsx
│   │   │   ├── Users.tsx
│   │   │   └── Stores.tsx
│   │   ├── components/
│   │   │   ├── Layout.tsx
│   │   │   └── PermissionModal.tsx
│   │   └── api/
│   │       └── client.ts
│   └── Dockerfile
├── k8s/                          # Kubernetes manifests
│   ├── namespace.yaml
│   ├── configmap.yaml
│   ├── secrets.yaml
│   ├── vector-api/
│   │   ├── deployment.yaml
│   │   └── service.yaml
│   ├── admin-ui/
│   │   ├── deployment.yaml
│   │   └── service.yaml
│   ├── ingress-api.yaml
│   └── ingress-ui.yaml
├── scripts/
│   └── init.sql                  # Database initialization
├── Dockerfile
├── requirements.txt
└── README.md

API Reference

Base URL

https://api.your-domain.com/v1

Authentication

Authorization: Bearer sk-your-api-key

Endpoints

Method	Endpoint	Description
`POST`	`/v1/vector_stores`	Create store
`GET`	`/v1/vector_stores`	List stores
`GET`	`/v1/vector_stores/{id}`	Get store
`DELETE`	`/v1/vector_stores/{id}`	Delete store
`POST`	`/v1/vector_stores/{id}/files`	Add texts
`GET`	`/v1/vector_stores/{id}/files`	List files
`DELETE`	`/v1/vector_stores/{id}/files/{file_id}`	Delete file
`POST`	`/v1/vector_stores/{id}/upload`	Upload file
`POST`	`/v1/vector_stores/{id}/search`	Search
`POST`	`/v1/vector_stores/{id}/rag`	RAG query
`POST`	`/v1/embeddings`	Create embeddings
`GET`	`/v1/embeddings/models`	List embedding models
`GET`	`/v1/models`	List all models

Example

import httpx

client = httpx.Client(
    base_url="https://api.your-domain.com/v1",
    headers={"Authorization": "Bearer sk-your-key"}
)

# Create store
store = client.post(
    "/vector_stores",
    json={"name": "My Knowledge Base"}
).json()

# Upload file
with open("document.pdf", "rb") as f:
    client.post(
        f"/vector_stores/{store['id']}/upload",
        files={"file": f}
    )

# Search
results = client.post(
    f"/vector_stores/{store['id']}/search",
    json={"query": "What is FastAPI?", "top_k": 3}
).json()

# RAG
answer = client.post(
    f"/vector_stores/{store['id']}/rag",
    json={"query": "What is FastAPI?"}
).json()
print(answer["answer"])

Configuration Reference

Environment Variables

Variable	Required	Default	Description
`DATABASE_URL`	✅	—	PostgreSQL connection URL
`LITELLM_PROXY_URL`	✅	—	LiteLLM proxy URL
`LITELLM_MASTER_KEY`	✅	—	LiteLLM master key
`ADMIN_USER_IDS`	✅	—	Comma-separated admin user IDs
`EMBEDDING_MODEL`	❌	`text-embedding-ada-002`	Default embedding model

Supported File Formats

Format	Extension	Notes
Text	`.txt`	UTF-8 encoded
PDF	`.pdf`	Text PDFs only, no scans
Word	`.docx`	Microsoft Word 2007+
Markdown	`.md`	Standard Markdown

Limits

Limit	Value
Max file size	256 MB
Max search results	50
Request timeout	600 seconds
Default chunk size	512 characters
Default chunk overlap	50 characters

Admin UI

The Admin UI is available at https://admin.your-domain.com.

📊 View usage statistics
👥 Manage users and their stores
🔑 Rotate API keys
🔒 Grant/revoke store permissions

Development

# Install dependencies
pip install -r requirements.txt

# Run locally
DATABASE_URL="postgresql://..." \
LITELLM_PROXY_URL="http://..." \
LITELLM_MASTER_KEY="sk-..." \
ADMIN_USER_IDS="your-id" \
uvicorn app.main:app --reload

# Run UI locally
cd ui
npm install
VITE_API_URL=http://localhost:8000 npm run dev

Tech Stack

Component	Technology
API	FastAPI + Python 3.12
Database	PostgreSQL 16 + pgvector
Auth	LiteLLM Key Management
Embeddings	Via LiteLLM Proxy
Admin UI	React + TypeScript + Tailwind CSS
Container	Docker + Kubernetes
Ingress	NGINX Ingress Controller
TLS	cert-manager + Let's Encrypt

License

MIT License - see LICENSE for details.

Contributing

Fork the repository
Create your feature branch (git checkout -b feature/my-feature)
Commit your changes (git commit -m 'Add my feature')
Push to the branch (git push origin feature/my-feature)
Open a Pull Request

9.8 KiB Raw Blame History