Files & RAG

How CoreAI turns uploaded files into retrieval-ready context for chat and assistants.

Files & RAG in CoreAI is the capability chain that turns uploaded documents into grounded context for chat, assistants, and other retrieval-enabled workflows.

At a high level, the flow is:

a user uploads a file through the platform
the backend stores the source file and creates a library record
a background worker chunks and processes the document
embeddings are generated and stored in Milvus
chat and assistant flows retrieve matching chunks later through RAG

What This Capability Covers

This capability is broader than file upload alone.

It includes:

the user-facing file library in the CoreAI Portal
file metadata and status tracking in the backend
source-file storage in S3-compatible object storage such as MinIO
background indexing through Temporal
document chunking and normalization through Docling
vector storage and similarity search in Milvus
retrieval use inside chat and assistant flows

User-Facing Entry Points

The main user-facing entry points are:

the Files section in the CoreAI Portal
the chat RAG picker in the portal composer

The Files area is where users upload, organize, rename, delete, and reprocess content.

The chat RAG control is where users decide which files or folders should be used as retrieval context for a conversation.

Core Processing Flow

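The backend implementation currently spans multiple runtime layers: the upload API, a Temporal-driven file-event worker, Docling-based document processing, object storage, and Milvus.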

Where The Data Lives

Different parts of the capability live in different systems.

Data type	Main system
Original uploaded file	`S3`-compatible storage such as `MinIO`
File library metadata and status	`PostgreSQL`
Chunk JSON and derived artifacts	`{bucket}-artifacts` object-storage bucket
Searchable vectors and retrieval payloads	`Milvus`

This boundary matters.

The file row shown in the portal is not the retrieval index itself. The portal library is driven by database metadata, while actual retrieval depends on vectorized chunks in Milvus.

Status Lifecycle

Files move through a visible RAG lifecycle.

rag_pending: upload succeeded and background indexing is queued
rag_processing: the worker is actively chunking, embedding, and indexing the file
rag_available: the file is indexed and can be used for retrieval
rag_failed: indexing failed
rag_removed: the file was removed from the retrieval lifecycle

This is why upload success is not the same as retrieval readiness.

A file can appear in the portal library before it is usable in grounded chat.
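The lifecycle above can be captured as a small enumeration. The following is a minimal sketch, not the platform's actual code: the status strings come from the list above, while `RagStatus`, `is_retrieval_ready`, and `is_in_progress` are hypothetical names introduced for illustration.

```python
from enum import Enum


class RagStatus(str, Enum):
    """RAG lifecycle states; the string values mirror the documented status names."""
    PENDING = "rag_pending"
    PROCESSING = "rag_processing"
    AVAILABLE = "rag_available"
    FAILED = "rag_failed"
    REMOVED = "rag_removed"


def is_retrieval_ready(status: str) -> bool:
    """Return True only when the file has been indexed and can be used for retrieval."""
    return RagStatus(status) is RagStatus.AVAILABLE


def is_in_progress(status: str) -> bool:
    """Return True while indexing is queued or actively running."""
    return RagStatus(status) in (RagStatus.PENDING, RagStatus.PROCESSING)
```

A client could use `is_retrieval_ready` to decide whether to offer a file for grounded chat, and `is_in_progress` to decide whether to keep showing a progress indicator. An unknown status string raises `ValueError` in this sketch.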

How Uploads Become Retrieval-Ready

The backend upload path currently uses:

POST /v1/files/upload
POST /v1/files/upload-stream

These endpoints do more than store bytes.

They also:

compute a deterministic object key
create or update the corresponding library record
set the initial status to rag_pending
start a Temporal workflow on file-events-queue

The actual indexing work does not happen inside the upload request itself.

Instead, a separate file-event worker performs the expensive background steps later. If that worker is not running, files can remain stuck in rag_pending even though the upload returned success.
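To make the division of work concrete, here is a minimal sketch of what an upload handler along these lines might do. Everything in it is an assumption for illustration: the key scheme in `object_key_for`, the `FileRecord` shape, and the `start_workflow` callable (a stand-in for starting a Temporal workflow on file-events-queue) are hypothetical, and storing the bytes in object storage is omitted.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class FileRecord:
    object_key: str
    filename: str
    status: str


def object_key_for(user_id: str, filename: str, content: bytes) -> str:
    """Hypothetical key scheme: the same user, name, and bytes always give the same key."""
    digest = hashlib.sha256(content).hexdigest()
    return f"{user_id}/{digest}/{filename}"


def handle_upload(
    user_id: str,
    filename: str,
    content: bytes,
    library: Dict[str, FileRecord],
    start_workflow: Callable[[str], None],
) -> FileRecord:
    """Record the upload and queue indexing; no chunking or embedding happens here."""
    key = object_key_for(user_id, filename, content)
    # Create or update the library record and mark it as queued for indexing.
    record = FileRecord(object_key=key, filename=filename, status="rag_pending")
    library[key] = record
    # Hand off to background indexing (stand-in for starting a Temporal workflow).
    start_workflow(key)
    return record
```

Because the handler returns as soon as the workflow is queued, a successful response here says nothing about whether the worker has picked the job up, which is how a file can stay in rag_pending when the worker is down.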

Docling, Chunking, And Embeddings

For standard document formats, the worker uses Docling-based processing to chunk and normalize the source material.

That processing can also produce additional artifacts such as chunk JSON and, where applicable, metadata, images, or tables.

After chunking, embeddings are generated for the chunks and inserted into Milvus.

Those vectors are what later power retrieval.
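The chunk, embed, and insert sequence can be sketched as below. This is a simplified illustration under stated assumptions. `chunk_text` is a naive word-window splitter standing in for Docling's processing, `toy_embedding` is a hash-based placeholder rather than a real embedding model, and `store` is an in-memory list standing in for a Milvus collection. The row field names are hypothetical.

```python
import hashlib
from typing import Dict, List


def chunk_text(text: str, max_words: int = 50) -> List[str]:
    """Naive fixed-size word chunking; a stand-in for Docling-based chunking."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embedding(chunk: str, dim: int = 8) -> List[float]:
    """Deterministic placeholder vector derived from a hash; not a real embedding model."""
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def index_document(document_id: str, text: str, store: List[Dict]) -> int:
    """Chunk the text, embed each chunk, and append one row per chunk to the store."""
    chunks = chunk_text(text)
    for position, chunk in enumerate(chunks):
        store.append({
            "document_id": document_id,
            "chunk_index": position,
            "text": chunk,
            "vector": toy_embedding(chunk),
        })
    return len(chunks)
```

Keeping the document ID on every row is what later allows retrieval to be scoped to specific files.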

How Chat Uses Files Later

The portal chat composer has a RAG control that searches the file library and lets users pick:

individual files
folders
all available retrieval-ready content

Only files with rag_available status are surfaced as directly selectable retrieval files in that picker.

When a user submits a chat request with retrieval enabled, the UI sends:

fileSearch
optional documentIds
optional folderIds

The backend then resolves the effective retrieval scope.

Folder selections are expanded on the backend into the underlying file IDs for that user, and the final retrieval step uses those document IDs when running similarity search.

This makes the flow user-friendly in the portal while still keeping retrieval scoped to concrete indexed documents under the hood.
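A rough sketch of that scope resolution follows. The request keys `fileSearch`, `documentIds`, and `folderIds` come from the description above. The `Folder` structure, the function names, and the cycle guard are assumptions made for illustration. The "all available content" selection is not modeled here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Folder:
    file_ids: List[str] = field(default_factory=list)
    child_folder_ids: List[str] = field(default_factory=list)


def expand_folder(
    folder_id: str,
    folders: Dict[str, Folder],
    seen: Optional[Set[str]] = None,
) -> Set[str]:
    """Collect file IDs from a folder and its nested folders, guarding against cycles."""
    seen = set() if seen is None else seen
    if folder_id in seen or folder_id not in folders:
        return set()
    seen.add(folder_id)
    folder = folders[folder_id]
    file_ids = set(folder.file_ids)
    for child_id in folder.child_folder_ids:
        file_ids |= expand_folder(child_id, folders, seen)
    return file_ids


def resolve_document_ids(request: dict, user_folders: Dict[str, Folder]) -> List[str]:
    """Merge explicit document IDs with the files found under the selected folders."""
    if not request.get("fileSearch"):
        return []
    document_ids = set(request.get("documentIds") or [])
    for folder_id in request.get("folderIds") or []:
        document_ids |= expand_folder(folder_id, user_folders)
    return sorted(document_ids)
```

Passing only the requesting user's folders as `user_folders` keeps the expansion limited to that user's content, in line with the per-user expansion described above.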

Retrieval Boundary

The retrieval path is currently centered on the backend RetrievalService.

In practice, it:

searches a configured Milvus collection
optionally filters by document IDs
can apply a user-specific filter
returns chunk text that is assembled into model context

The chat-oriented file_search tool uses this retrieval layer so that the assistant can ground responses in the selected content instead of relying only on base-model knowledge.
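The filtering and ranking behavior described above can be illustrated with an in-memory stand-in. This is a sketch, not RetrievalService itself: the `retrieve` function, the row fields (`document_id`, `user_id`, `text`, `vector`), and the use of cosine similarity are assumptions, and a real deployment would delegate the search to Milvus.

```python
import math
from typing import Dict, List, Optional, Sequence, Set


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query_vector: Sequence[float],
    rows: List[Dict],
    top_k: int = 3,
    document_ids: Optional[Set[str]] = None,
    user_id: Optional[str] = None,
) -> List[str]:
    """Return the text of the top_k most similar chunks after applying optional filters."""
    candidates = [
        row for row in rows
        if (document_ids is None or row["document_id"] in document_ids)
        and (user_id is None or row["user_id"] == user_id)
    ]
    ranked = sorted(
        candidates,
        key=lambda row: cosine_similarity(query_vector, row["vector"]),
        reverse=True,
    )
    return [row["text"] for row in ranked[:top_k]]
```

The returned chunk texts are what would then be assembled into the model context.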

Files, Folders, And Scope

The platform treats files and folders differently:

files are the actual indexed retrieval units
folders are a user-facing grouping and selection mechanism

That means selecting a folder in chat is really a convenient way to include the files contained in that folder hierarchy.

This also explains why the Files area and the chat RAG picker are so closely connected.

Reprocess And Delete

The Files area is not only for initial upload.

It also supports later lifecycle actions.

Reprocess

Reprocess reruns indexing for an already stored source file.

This is useful when:

indexing previously failed
processing settings changed
the vector representation needs to be rebuilt

Reprocess does not require the user to upload the file again.

Delete

Delete removes the file from the library lifecycle and triggers cleanup of its indexed content.

From a user perspective, the important point is that delete is not only a UI metadata operation. It also participates in the broader storage and retrieval cleanup path.

Operational Realities

The current implementation has a few practical behaviors that matter to operators and advanced users:

indexing is asynchronous
upload success does not guarantee retrieval success
Temporal, Docling, object storage, and Milvus all need to be available for the full pipeline to complete
only /v1/files/upload and /v1/files/upload-stream enter the main RAG indexing path
a failed reprocess can temporarily leave a file without active vectors until indexing succeeds again

For day-to-day portal use, the most important takeaway is simple: wait until a newly uploaded document shows RAG Available (rag_available) before expecting grounded chat answers from it.
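For scripted workflows, the same advice can be expressed as a polling loop. The sketch below is hypothetical. `get_status` is a caller-supplied function that returns the file's current status string (how it fetches that status is left open, since no status endpoint is described here), and the timeout and interval defaults are arbitrary.

```python
import time
from typing import Callable


def wait_until_available(
    get_status: Callable[[], str],
    timeout_seconds: float = 300.0,
    poll_interval_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until rag_available; return False on rag_failed, rag_removed, or timeout."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status == "rag_available":
            return True
        if status in ("rag_failed", "rag_removed"):
            return False
        if time.monotonic() >= deadline:
            return False
        sleep(poll_interval_seconds)
```

Treating rag_failed as a terminal result, rather than continuing to poll, matches the point above that upload success does not guarantee retrieval success. A caller can then decide whether to trigger a reprocess.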

Why This Matters In CoreAI

This capability is one of the main reasons CoreAI can support grounded assistants and enterprise document workflows.

Without it, chat would rely on direct model generation alone.

With it, the platform can:

ground answers in uploaded content
scope retrieval to specific files or folders
connect document management with conversational AI
support more governed and inspectable knowledge workflows
