# Architecture

## RAG Pipeline
The assistant processes user queries through a Retrieval-Augmented Generation pipeline:
- Upload — files are uploaded via POST /upload, parsed, and chunked
- Index — chunks are embedded and stored in Qdrant as vector documents
- Chat — at query time, relevant chunks are retrieved and injected into the OpenAI prompt
- Stream — the response is streamed back to the client via SSE
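The retrieval step in this pipeline can be sketched with a minimal cosine-similarity search. This is a stand-in for what Qdrant does at scale; the toy 2-d vectors and helper names here are illustrative, not the actual rag.py API:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve_top_k(query_vec: list[float], chunks: list[dict], k: int = 2) -> list[dict]:
    """Rank stored chunks by similarity to the query embedding and keep the best k."""
    ranked = sorted(chunks, key=lambda c: cosine_similarity(query_vec, c["vector"]), reverse=True)
    return ranked[:k]

# Toy 2-d "embeddings" stand in for the high-dimensional OpenAI vectors.
chunks = [
    {"text": "Qdrant stores vectors", "vector": [1.0, 0.0]},
    {"text": "FastAPI serves HTTP",   "vector": [0.0, 1.0]},
    {"text": "Vectors enable search", "vector": [0.9, 0.1]},
]
top = retrieve_top_k([1.0, 0.1], chunks, k=2)
```

The retrieved chunk texts are what get injected into the OpenAI prompt as context before streaming begins.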
## Component Overview
| Component | Module | Role |
|---|---|---|
| FastAPI application | main.py | HTTP server, route registration, lifespan hooks |
| RAG pipeline | rag.py | LlamaIndex index, retrieval, query engine |
| Ingestion | ingest.py | Parses uploads, splits into nodes, upserts to Qdrant |
| Auth | auth.py | JWKS-based JWT verification, identity extraction |
| History | history.py | Conversation and message persistence |
| Uploads | uploads.py | File storage, attachment management |
| OpenAI streaming | openai_stream.py | SSE token streaming with LlamaIndex |
| Vision | openai_vision.py | Image attachment support in chat messages |
| Qdrant setup | qdrant_setup.py | Collection initialization and management |
| Settings | settings.py | Pydantic-settings environment configuration |
## Frontend
This repository is a pure API backend. The chat UI is maintained separately in celine-frontend:
- packages/assistant-ui — @celine-eu/assistant-ui Svelte component library (ChatCore, AssistantWidget, etc.)
- apps/assistant — standalone full-page assistant app
The frontend communicates with this API at apiBaseUrl. When deployed inside the participant webapp, requests are proxied through the celine-webapp BFF.
## Service Dependencies
| Service | Purpose |
|---|---|
| OpenAI | Chat completions and text embeddings |
| Qdrant | Vector storage and similarity search |
| PostgreSQL | Conversation history and attachment metadata |
| S3 / object storage | Raw file storage for uploads |
| Keycloak / JWKS endpoint | JWT public key discovery for auth |
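Each of these services is configured through environment variables read by settings.py. That module uses pydantic-settings; the stdlib-only sketch below mirrors the idea, and the variable names and defaults are illustrative rather than the actual configuration keys:

```python
import os
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Settings:
    """Environment-driven configuration, one field per service dependency.
    A simplified stand-in for the pydantic-settings class in settings.py."""
    openai_api_key: str = field(default_factory=lambda: os.environ["OPENAI_API_KEY"])
    qdrant_url: str = field(
        default_factory=lambda: os.environ.get("QDRANT_URL", "http://localhost:6333"))
    database_url: str = field(
        default_factory=lambda: os.environ.get(
            "DATABASE_URL", "postgresql+asyncpg://localhost/assistant"))
    s3_bucket: str = field(default_factory=lambda: os.environ.get("S3_BUCKET", "uploads"))
    jwks_url: str = field(default_factory=lambda: os.environ.get("JWKS_URL", ""))

os.environ.setdefault("OPENAI_API_KEY", "sk-test")  # demo value only
settings = Settings()
```

A missing required variable (here, OPENAI_API_KEY) fails fast at startup, while optional ones fall back to local-development defaults.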
## Data Flow
Upload path:
POST /upload → parse file → split into chunks → embed (OpenAI) → upsert (Qdrant) → store metadata (PostgreSQL)
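The split step can be sketched as a simple overlapping-window chunker. The real ingest.py splits into LlamaIndex nodes with sentence awareness; this character-based version only illustrates the shape of the operation, and the sizes are arbitrary:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows. Overlap preserves
    context that would otherwise be cut at chunk boundaries."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

doc = "word " * 100  # a 500-character document
chunks = chunk_text(doc, chunk_size=200, overlap=50)
```

Each chunk is then embedded and upserted to Qdrant with its source-file metadata.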
Chat path:
POST /chat → verify JWT → load history → retrieve context (Qdrant) → build prompt → stream (OpenAI SSE)
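The streaming tail of the chat path can be sketched as an async generator that wraps each model token in an SSE frame. The frame format and the done marker here are illustrative, not necessarily the exact wire format produced by openai_stream.py:

```python
import asyncio
import json

def sse_event(data: dict) -> str:
    """Format one Server-Sent Events frame: a data line plus a blank line."""
    return f"data: {json.dumps(data)}\n\n"

async def fake_token_stream():
    """Stand-in for the OpenAI token stream."""
    for token in ["Hello", ", ", "world"]:
        yield token

async def stream_chat() -> list[str]:
    """Collect SSE frames, one per token, with a terminal done marker.
    In the real app a generator like this feeds FastAPI's StreamingResponse."""
    frames = [sse_event({"delta": tok}) async for tok in fake_token_stream()]
    frames.append("data: [DONE]\n\n")
    return frames

frames = asyncio.run(stream_chat())
```

Because frames are flushed as they are produced, the client renders tokens incrementally instead of waiting for the full completion.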
## Database Models
Conversations and messages are stored in PostgreSQL using SQLAlchemy async models. Alembic handles schema migrations. The db/models.py module defines:
- Conversation — linked to a user identity from the JWT subject claim
- Message — role (user/assistant), content, timestamp, optional attachment refs
- Attachment — file metadata: name, MIME type, Qdrant collection reference