Files & RAG

How CoreAI turns uploaded files into retrieval-ready context for chat and assistants.


Files & RAG in CoreAI is the capability chain that turns uploaded documents into grounded context for chat, assistants, and other retrieval-enabled workflows.

At a high level, the flow is:

  1. a user uploads a file through the platform
  2. the backend stores the source file and creates a library record
  3. a background worker chunks and processes the document
  4. embeddings are generated and stored in Milvus
  5. chat and assistant flows retrieve matching chunks later through RAG

What This Capability Covers

This capability is broader than file upload alone.

It includes:

  • the user-facing file library in the CoreAI Portal
  • file metadata and status tracking in the backend
  • source-file storage in S3-compatible object storage such as MinIO
  • background indexing through Temporal
  • document chunking and normalization through Docling
  • vector storage and similarity search in Milvus
  • retrieval use inside chat and assistant flows

User-Facing Entry Points

The main user-facing entry points are:

  • the Files section in the CoreAI Portal
  • the chat RAG picker in the portal composer

The Files area is where users upload, organize, rename, delete, and reprocess content.

The chat RAG control is where users decide which files or folders should be used as retrieval context for a conversation.

Core Processing Flow

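The backend implementation currently spans multiple runtime layers: the upload API, the Temporal-driven file-event worker, Docling-based processing, and Milvus for vector storage and search.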

Where The Data Lives

Different parts of the capability live in different systems.

| Data type | Main system |
| --- | --- |
| Original uploaded file | S3-compatible storage such as MinIO |
| File library metadata and status | PostgreSQL |
| Chunk JSON and derived artifacts | {bucket}-artifacts object-storage bucket |
| Searchable vectors and retrieval payloads | Milvus |

This boundary matters.

The file row shown in the portal is not the retrieval index itself. The portal library is driven by database metadata, while actual retrieval depends on vectorized chunks in Milvus.

Status Lifecycle

Files move through a visible RAG lifecycle.

  • rag_pending: upload succeeded and background indexing is queued
  • rag_processing: the worker is actively chunking, embedding, and indexing the file
  • rag_available: the file is indexed and can be used for retrieval
  • rag_failed: indexing failed
  • rag_removed: the file was removed from the retrieval lifecycle

This is why upload success is not the same as retrieval readiness.

A file can appear in the portal library before it is usable in grounded chat.
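
The lifecycle above can be modeled as a small state machine. The following is a minimal sketch, not the CoreAI implementation: the status strings come from this page, but the `RagStatus` enum, the `ALLOWED_TRANSITIONS` table, and both helper functions are hypothetical, and the transition rules are inferred from the descriptions rather than confirmed.

```python
from enum import Enum


class RagStatus(str, Enum):
    """Status values as listed on this page."""
    PENDING = "rag_pending"
    PROCESSING = "rag_processing"
    AVAILABLE = "rag_available"
    FAILED = "rag_failed"
    REMOVED = "rag_removed"


# ASSUMPTION: these transitions are inferred from the lifecycle description.
# Reprocess is modeled as a return to rag_pending; the real backend may differ.
ALLOWED_TRANSITIONS = {
    RagStatus.PENDING: {RagStatus.PROCESSING, RagStatus.REMOVED},
    RagStatus.PROCESSING: {RagStatus.AVAILABLE, RagStatus.FAILED, RagStatus.REMOVED},
    RagStatus.AVAILABLE: {RagStatus.PENDING, RagStatus.REMOVED},
    RagStatus.FAILED: {RagStatus.PENDING, RagStatus.REMOVED},
    RagStatus.REMOVED: set(),
}


def can_transition(current: RagStatus, target: RagStatus) -> bool:
    """Return True if the assumed lifecycle allows moving from current to target."""
    return target in ALLOWED_TRANSITIONS[current]


def is_retrieval_ready(status: str) -> bool:
    """Only rag_available files can be used for retrieval."""
    return RagStatus(status) is RagStatus.AVAILABLE
```

A check such as `is_retrieval_ready(record_status)` is the kind of guard a client could apply before offering a file as retrieval context.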

How Uploads Become Retrieval-Ready

The backend upload path currently uses:

  • POST /v1/files/upload
  • POST /v1/files/upload-stream

These endpoints do more than store bytes.

They also:

  • compute a deterministic object key
  • create or update the corresponding library record
  • set the initial status to rag_pending
  • start a Temporal workflow on file-events-queue

The actual indexing work does not happen inside the upload request itself.

Instead, a separate file-event worker performs the expensive background steps later. If that worker is not running, files can remain stuck in rag_pending even though the upload returned success.
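
The upload-side steps can be sketched in memory as follows. This is an illustration under stated assumptions, not the backend code: the dictionaries and the `deque` stand in for object storage, PostgreSQL, and the Temporal queue, and the key scheme in `object_key_for`, the record fields, and the event shape are all hypothetical.

```python
import hashlib
from collections import deque

object_store = {}      # stand-in for S3-compatible storage (object key -> bytes)
library = {}           # stand-in for the PostgreSQL library table (object key -> record)
file_events = deque()  # stand-in for the Temporal task queue "file-events-queue"


def object_key_for(user_id: str, filename: str) -> str:
    """Derive a deterministic key. ASSUMPTION: the real key scheme is not documented here."""
    digest = hashlib.sha256(f"{user_id}/{filename}".encode("utf-8")).hexdigest()
    return f"{user_id}/{digest[:16]}/{filename}"


def handle_upload(user_id: str, filename: str, data: bytes) -> dict:
    """Store bytes, upsert the library record, mark it rag_pending, and queue indexing."""
    key = object_key_for(user_id, filename)
    object_store[key] = data
    # Create the record on first upload, reuse it on re-upload of the same file.
    record = library.setdefault(key, {"user_id": user_id, "filename": filename})
    record["size"] = len(data)
    record["status"] = "rag_pending"
    # The request returns after queueing; chunking and embedding happen in the worker.
    file_events.append({"type": "file_uploaded", "object_key": key})
    return record
```

In this sketch nothing ever drains `file_events`, so every record stays at `rag_pending`, which mirrors what happens when the file-event worker is not running.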

Docling, Chunking, And Embeddings

For standard document formats, the worker uses Docling-based processing to chunk and normalize the source material.

That processing can also produce additional artifacts such as chunk JSON and, where applicable, metadata, images, or tables.

After chunking, embeddings are generated for the chunks and inserted into Milvus.

Those vectors are what later power retrieval.
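
To make the chunk-then-embed step concrete, here is a deliberately simplified sketch. It does not use Docling or a real embedding model: `chunk_words` is a plain overlapping word-window splitter, `toy_embed` is a hashed bag-of-words stand-in, and the row fields in `build_rows` are hypothetical rather than the actual Milvus schema.

```python
import hashlib
import math


def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(text: str, dim: int = 8) -> list[float]:
    """Stand-in embedding: hash each token into a bucket, then L2-normalize."""
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_rows(document_id: str, text: str) -> list[dict]:
    """Produce one row per chunk, in the shape a vector store insert might take."""
    return [
        {"document_id": document_id, "chunk_index": i, "text": chunk, "vector": toy_embed(chunk)}
        for i, chunk in enumerate(chunk_words(text))
    ]
```

The overlap keeps sentences that straddle a chunk boundary retrievable from either side; real document-aware chunkers typically split on structure such as headings and tables instead of fixed word counts.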

How Chat Uses Files Later

The portal chat composer has a RAG control that searches the file library and lets users pick:

  • individual files
  • folders
  • all available retrieval-ready content

Only files with rag_available status are surfaced as directly selectable retrieval files in that picker.

When a user submits a chat request with retrieval enabled, the UI sends:

  • fileSearch
  • optional documentIds
  • optional folderIds

The backend then resolves the effective retrieval scope.

Folder selections are expanded on the backend into the underlying file IDs for that user, and the final retrieval step uses those document IDs when running similarity search.

This makes the flow user-friendly in the portal while still keeping retrieval scoped to concrete indexed documents under the hood.
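
The scope resolution described above can be sketched as follows. The function names, the record fields (`id`, `user_id`, `folder_id`, `parent_id`, `status`), and the decision to keep only `rag_available` files are assumptions made for illustration; only the `documentIds` and `folderIds` inputs and the per-user folder expansion come from this page.

```python
def descendant_folders(folder_ids, folders):
    """Return the selected folders plus every folder nested beneath them."""
    children = {}
    for folder in folders:
        children.setdefault(folder["parent_id"], []).append(folder["id"])
    seen = set()
    stack = list(folder_ids)
    while stack:
        folder_id = stack.pop()
        if folder_id in seen:
            continue
        seen.add(folder_id)
        stack.extend(children.get(folder_id, []))
    return seen


def resolve_scope(user_id, files, folders, document_ids=(), folder_ids=()):
    """Expand folder selections into concrete file IDs owned by this user."""
    in_folders = descendant_folders(folder_ids, folders)
    wanted = set(document_ids)
    return sorted(
        f["id"]
        for f in files
        if f["user_id"] == user_id
        and f["status"] == "rag_available"  # ASSUMPTION: non-ready files are skipped
        and (f["id"] in wanted or f["folder_id"] in in_folders)
    )
```

For example, selecting a top-level folder yields every retrieval-ready file of that user in the folder and its subfolders, while files owned by other users are never included even if their IDs are passed explicitly.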

Retrieval Boundary

The retrieval path is currently centered around the backend RetrievalService.

In practice, it:

  • searches a configured Milvus collection
  • optionally filters by document IDs
  • can apply a user-specific filter
  • returns chunk text that is assembled into model context

The chat-oriented file_search tool uses this retrieval layer so that the assistant can ground responses in the selected content instead of relying only on base-model knowledge.
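
The retrieval behavior listed above can be approximated with an in-memory similarity search. This sketch does not call Milvus and is not the RetrievalService code: the row fields, the `search` signature, and the use of cosine similarity are assumptions chosen to show how document and user filters narrow the candidate set before ranking.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(rows, query_vector, top_k=3, document_ids=None, user_id=None):
    """Filter rows by optional document IDs and user, then rank by similarity."""
    candidates = [
        row
        for row in rows
        if (document_ids is None or row["document_id"] in document_ids)
        and (user_id is None or row["user_id"] == user_id)
    ]
    ranked = sorted(candidates, key=lambda r: cosine(r["vector"], query_vector), reverse=True)
    return [row["text"] for row in ranked[:top_k]]


def build_context(chunks):
    """Join retrieved chunk text into a single block for the model prompt."""
    return "\n\n".join(chunks)
```

Filtering before ranking is what keeps results scoped to the selected documents: a highly similar chunk from an unselected file never reaches the model context.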

Files, Folders, And Scope

The platform treats files and folders differently:

  • files are the actual indexed retrieval units
  • folders are a user-facing grouping and selection mechanism

That means selecting a folder in chat is really a convenient way to include the files contained in that folder hierarchy.

This also explains why the Files area and the chat RAG picker are so closely connected.

Reprocess And Delete

The Files area is not only for initial upload.

It also supports later lifecycle actions.

Reprocess

Reprocess reruns indexing for an already stored source file.

This is useful when:

  • indexing previously failed
  • processing settings changed
  • the vector representation needs to be rebuilt

Reprocess does not require the user to upload the file again.

Delete

Delete removes the file from the library lifecycle and triggers cleanup behavior for the indexed content.

From a user perspective, the important point is that delete is not only a UI metadata operation. It also participates in the broader storage and retrieval cleanup path.

Operational Realities

The current implementation has a few practical behaviors that matter to operators and advanced users:

  • indexing is asynchronous
  • upload success does not guarantee retrieval success
  • Temporal, Docling, object storage, and Milvus all need to be available for the full pipeline to complete
  • only /v1/files/upload and /v1/files/upload-stream enter the main RAG indexing path
  • a failed reprocess can temporarily leave a file without active vectors until indexing succeeds again

For day-to-day portal use, the most important takeaway is simple: wait until a newly uploaded document reaches rag_available (shown as RAG Available) before expecting grounded chat answers from it.
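
For scripted or automated use, the same advice translates into polling the file status before sending retrieval-enabled requests. The sketch below is hedged: `fetch_status` is a hypothetical callable supplied by the caller, because this page does not name a status endpoint, and the timeout and interval defaults are arbitrary.

```python
import time


def wait_until_available(fetch_status, file_id, timeout_s=300.0, interval_s=2.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll until the file is retrieval-ready.

    Returns True on rag_available, False on rag_failed or rag_removed,
    and raises TimeoutError if neither happens within timeout_s.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status(file_id)  # hypothetical: returns a status string
        if status == "rag_available":
            return True
        if status in ("rag_failed", "rag_removed"):
            return False
        if clock() >= deadline:
            raise TimeoutError(f"{file_id} still {status} after {timeout_s}s")
        sleep(interval_s)
```

A timeout here is worth surfacing to operators rather than retrying silently, since a file that never leaves rag_pending often points to a file-event worker that is not running.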

Why This Matters In CoreAI

This capability is one of the main reasons CoreAI can support grounded assistants and enterprise document workflows.

Without it, chat could rely only on direct model generation.

With it, the platform can:

  • ground answers in uploaded content
  • scope retrieval to specific files or folders
  • connect document management with conversational AI
  • support more governed and inspectable knowledge workflows
