Files & RAG
How CoreAI turns uploaded files into retrieval-ready context for chat and assistants.
Files & RAG in CoreAI is the capability chain that turns uploaded documents into grounded context for chat, assistants, and other retrieval-enabled workflows.
At a high level, the flow is:
- a user uploads a file through the platform
- the backend stores the source file and creates a library record
- a background worker chunks and processes the document
- embeddings are generated and stored in Milvus
- chat and assistant flows retrieve matching chunks later through RAG
What This Capability Covers
This capability is broader than file upload alone.
It includes:
- the user-facing file library in the CoreAI Portal
- file metadata and status tracking in the backend
- source-file storage in S3-compatible object storage such as MinIO
- background indexing through Temporal
- document chunking and normalization through Docling
- vector storage and similarity search in Milvus
- retrieval use inside chat and assistant flows
User-Facing Entry Points
The main user-facing entry points are:
- the Files section in the CoreAI Portal
- the chat RAG picker in the portal composer
The Files area is where users upload, organize, rename, delete, and reprocess content.
The chat RAG control is where users decide which files or folders should be used as retrieval context for a conversation.
Core Processing Flow
The backend implementation currently spans multiple runtime layers.
Where The Data Lives
Different parts of the capability live in different systems.
| Data type | Main system |
|---|---|
| Original uploaded file | S3-compatible storage such as MinIO |
| File library metadata and status | PostgreSQL |
| Chunk JSON and derived artifacts | {bucket}-artifacts object-storage bucket |
| Searchable vectors and retrieval payloads | Milvus |
This boundary matters.
The file row shown in the portal is not the retrieval index itself. The portal library is driven by database metadata, while actual retrieval depends on vectorized chunks in Milvus.
Status Lifecycle
Files move through a visible RAG lifecycle.
- rag_pending: upload succeeded and background indexing is queued
- rag_processing: the worker is actively chunking, embedding, and indexing the file
- rag_available: the file is indexed and can be used for retrieval
- rag_failed: indexing failed
- rag_removed: the file was removed from the retrieval lifecycle
This is why upload success is not the same as retrieval readiness.
A file can appear in the portal library before it is usable in grounded chat.
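The distinction between "uploaded" and "retrieval-ready" can be sketched in a few lines of Python. The status strings come from the lifecycle above; the `RagStatus` enum and the two helper functions are hypothetical names used only for illustration, not part of the CoreAI API.

```python
from enum import Enum


class RagStatus(str, Enum):
    """Status values named in the lifecycle above."""

    PENDING = "rag_pending"
    PROCESSING = "rag_processing"
    AVAILABLE = "rag_available"
    FAILED = "rag_failed"
    REMOVED = "rag_removed"


def is_retrieval_ready(status: str) -> bool:
    """Only rag_available files can be used for retrieval."""
    return RagStatus(status) is RagStatus.AVAILABLE


def is_in_progress(status: str) -> bool:
    """Pending and processing files may still become available."""
    return RagStatus(status) in (RagStatus.PENDING, RagStatus.PROCESSING)
```

A client that lists the library could use a check like `is_retrieval_ready` to decide which files to offer for grounded chat.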
How Uploads Become Retrieval-Ready
The backend upload path currently uses:
- POST /v1/files/upload
- POST /v1/files/upload-stream
These endpoints do more than store bytes.
They also:
- compute a deterministic object key
- create or update the corresponding library record
- set the initial status to rag_pending
- start a Temporal workflow on file-events-queue
The actual indexing work does not happen inside the upload request itself.
Instead, a separate file-event worker performs the expensive background steps later. If that worker is not running, files can remain stuck in rag_pending even though the upload returned success.
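The upload steps listed above can be sketched as follows. This is a minimal in-memory illustration, not the actual backend code: the key scheme in `object_key_for`, the `LibraryRecord` type, and the `start_workflow` callback are all assumptions standing in for the real object storage, PostgreSQL record, and Temporal client.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class LibraryRecord:
    """Hypothetical stand-in for the library row kept in the database."""

    object_key: str
    filename: str
    status: str


def object_key_for(user_id: str, filename: str) -> str:
    # Assumed key scheme: the same inputs always produce the same key.
    digest = hashlib.sha256(f"{user_id}/{filename}".encode("utf-8")).hexdigest()
    return f"{user_id}/{digest[:16]}/{filename}"


def handle_upload(
    user_id: str,
    filename: str,
    data: bytes,
    blobs: Dict[str, bytes],
    library: Dict[str, LibraryRecord],
    start_workflow: Callable[[str, str], None],
) -> LibraryRecord:
    key = object_key_for(user_id, filename)
    blobs[key] = data  # store the source file
    record = LibraryRecord(key, filename, "rag_pending")
    library[key] = record  # create or update the library record
    # Indexing is only queued here; the worker does the heavy work later.
    start_workflow("file-events-queue", key)
    return record
```

Because the key is deterministic, uploading the same file twice updates one record instead of creating a duplicate. The handler returns as soon as the workflow is queued, which is why a successful response says nothing about whether indexing has finished.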
Docling, Chunking, And Embeddings
For standard document formats, the worker uses Docling-based processing to chunk and normalize the source material.
That processing can also produce additional artifacts such as chunk JSON and, where applicable, metadata, images, or tables.
After chunking, embeddings are generated for the chunks and inserted into Milvus.
Those vectors are what later power retrieval.
How Chat Uses Files Later
The portal chat composer has a RAG control that searches the file library and lets users pick:
- individual files
- folders
- all available retrieval-ready content
Only files with rag_available status are surfaced as directly selectable retrieval files in that picker.
When a user submits a chat request with retrieval enabled, the UI sends:
- fileSearch
- optional documentIds
- optional folderIds
The backend then resolves the effective retrieval scope.
Folder selections are expanded on the backend into the underlying file IDs for that user, and the final retrieval step uses those document IDs when running similarity search.
This makes the flow user-friendly in the portal while still keeping retrieval scoped to concrete indexed documents under the hood.
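A rough sketch of that scope resolution is shown below. The function name and the two lookup dictionaries are hypothetical; in the real backend the folder tree and file membership would come from the database and would already be limited to the requesting user.

```python
from typing import Dict, List, Optional, Sequence


def resolve_document_ids(
    document_ids: Optional[Sequence[str]],
    folder_ids: Optional[Sequence[str]],
    child_folders: Dict[str, List[str]],
    folder_files: Dict[str, List[str]],
) -> List[str]:
    """Expand folder selections into the concrete file IDs used for search."""
    selected = set(document_ids or [])
    stack = list(folder_ids or [])
    seen = set()
    while stack:
        folder = stack.pop()
        if folder in seen:
            continue  # guard against visiting a folder twice
        seen.add(folder)
        selected.update(folder_files.get(folder, []))
        stack.extend(child_folders.get(folder, []))  # include nested folders
    return sorted(selected)
```

For example, selecting one folder that contains a subfolder yields the files from both levels, merged with any individually selected documents.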
Retrieval Boundary
The retrieval path is currently centered around the backend RetrievalService.
In practice, it:
- searches a configured Milvus collection
- optionally filters by document IDs
- can apply a user-specific filter
- returns chunk text that is assembled into model context
The chat-oriented file_search tool uses this retrieval layer so that the assistant can ground responses in the selected content instead of relying only on base-model knowledge.
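The optional filters described above might be combined into a single Milvus boolean filter expression along these lines. The field names `document_id` and `user_id` are assumptions, since the actual collection schema is not described here, and `build_filter` is a hypothetical helper rather than a RetrievalService method.

```python
import json
from typing import Optional, Sequence


def build_filter(
    document_ids: Optional[Sequence[str]] = None,
    user_id: Optional[str] = None,
) -> str:
    """Build a filter string from the optional scope constraints."""
    clauses = []
    if document_ids:
        # json.dumps gives a double-quoted list literal such as ["d1", "d2"].
        clauses.append(f"document_id in {json.dumps(list(document_ids))}")
    if user_id is not None:
        clauses.append(f"user_id == {json.dumps(user_id)}")
    # An empty string means no scalar filtering is applied.
    return " and ".join(clauses)
```

With both constraints set, the similarity search would only consider chunks that belong to the selected documents and to the requesting user.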
Files, Folders, And Scope
The platform treats files and folders differently:
- files are the actual indexed retrieval units
- folders are a user-facing grouping and selection mechanism
That means selecting a folder in chat is really a convenient way to include the files contained in that folder hierarchy.
This also explains why the Files area and the chat RAG picker are so closely connected.
Reprocess And Delete
The Files area is not only for initial upload.
It also supports later lifecycle actions.
Reprocess
Reprocess reruns indexing for an already stored source file.
This is useful when:
- indexing previously failed
- processing settings changed
- the vector representation needs to be rebuilt
Reprocess does not require the user to upload the file again.
Delete
Delete removes the file from the library lifecycle and triggers cleanup behavior for the indexed content.
From a user perspective, the important point is that delete is not only a UI metadata operation. It also participates in the broader storage and retrieval cleanup path.
Operational Realities
The current implementation has a few practical behaviors that matter to operators and advanced users:
- indexing is asynchronous
- upload success does not guarantee retrieval success
- Temporal, Docling, object storage, and Milvus all need to be available for the full pipeline to complete
- only /v1/files/upload and /v1/files/upload-stream enter the main RAG indexing path
- a failed reprocess can temporarily leave a file without active vectors until indexing succeeds again
For day-to-day portal use, the most important takeaway is simple: wait for RAG Available before expecting grounded chat answers from a newly uploaded document.
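For scripted or automated use, the same advice can be expressed as a small polling loop. This is a sketch under assumptions: `get_status` is a hypothetical callable that returns the file's current status string (how it fetches that status is left open), and the timeout and interval defaults are arbitrary.

```python
import time
from typing import Callable


def wait_until_available(
    get_status: Callable[[], str],
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the file is rag_available, False if indexing ended otherwise."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == "rag_available":
            return True
        if status in ("rag_failed", "rag_removed"):
            return False  # terminal states: waiting longer will not help
        if time.monotonic() >= deadline:
            raise TimeoutError(f"file still in {status!r} after {timeout_s} seconds")
        sleep(interval_s)
```

A timeout while the status stays at rag_pending is a hint that the file-event worker may not be running.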
Why This Matters In CoreAI
This capability is one of the main reasons CoreAI can support grounded assistants and enterprise document workflows.
Without it, chat could rely only on what the base model already knows.
With it, the platform can:
- ground answers in uploaded content
- scope retrieval to specific files or folders
- connect document management with conversational AI
- support more governed and inspectable knowledge workflows