# Files & RAG (/docs/coreai/files-rag)



`Files & RAG` in CoreAI is the capability chain that turns uploaded documents into grounded context for chat, assistants, and other retrieval-enabled workflows.

At a high level, the flow is:

1. a user uploads a file through the platform
2. the backend stores the source file and creates a library record
3. a background worker chunks and processes the document
4. embeddings are generated and stored in `Milvus`
5. chat and assistant flows retrieve matching chunks later through RAG

What This Capability Covers [#what-this-capability-covers]

This capability is broader than file upload alone.

It includes:

* the user-facing file library in the `CoreAI Portal`
* file metadata and status tracking in the backend
* source-file storage in `S3`-compatible object storage such as `MinIO`
* background indexing through `Temporal`
* document chunking and normalization through `Docling`
* vector storage and similarity search in `Milvus`
* retrieval use inside chat and assistant flows

User-Facing Entry Points [#user-facing-entry-points]

The main user-facing entry points are:

* the `Files` section in the `CoreAI Portal`
* the chat `RAG` picker in the portal composer

The Files area is where users upload, organize, rename, delete, and reprocess content.

The chat `RAG` control is where users decide which files or folders should be used as retrieval context for a conversation.

Core Processing Flow [#core-processing-flow]

The backend implementation currently spans three stages: the upload request, the background file-event worker, and later chat retrieval.

<Mermaid
  chart="flowchart TD
    A[Portal Files UI] --> B[`/v1/files/upload` or `/v1/files/upload-stream`]
    B --> C[Store source file in S3-compatible object storage]
    C --> D[Create `library_files` row in Postgres with `rag_pending`]
    D --> E[Start Temporal workflow on `file-events-queue`]

    E --> F[file-event worker]
    F --> G[Set status to `rag_processing`]
    G --> H[Remove older vectors for the same source]
    H --> I[Download source file]
    I --> J[Chunk and normalize document]
    J --> K[Write chunk and artifact files to the artifacts bucket]
    K --> L[Generate embeddings and insert them into Milvus]
    L --> M[Set status to `rag_available`]

    M --> N[Later chat retrieval]
    N --> O[Use selected file or folder scope]
    O --> P[Resolve target document IDs]
    P --> Q[Run similarity search in Milvus]
    Q --> R[Pass matching chunks to the model as grounded context]"
/>

Where The Data Lives [#where-the-data-lives]

Different parts of the capability live in different systems.

| Data type                                 | Main system                                |
| ----------------------------------------- | ------------------------------------------ |
| Original uploaded file                    | `S3`-compatible storage such as `MinIO`    |
| File library metadata and status          | `PostgreSQL`                               |
| Chunk JSON and derived artifacts          | `{bucket}-artifacts` object-storage bucket |
| Searchable vectors and retrieval payloads | `Milvus`                                   |

This boundary matters.

The file row shown in the portal is not the retrieval index itself. The portal library is driven by database metadata, while actual retrieval depends on vectorized chunks in `Milvus`.

Status Lifecycle [#status-lifecycle]

Files move through a visible RAG lifecycle.

* `rag_pending`: upload succeeded and background indexing is queued
* `rag_processing`: the worker is actively chunking, embedding, and indexing the file
* `rag_available`: the file is indexed and can be used for retrieval
* `rag_failed`: indexing failed
* `rag_removed`: the file was removed from the retrieval lifecycle

This is why upload success is not the same as retrieval readiness.

A file can appear in the portal library before it is usable in grounded chat.
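The statuses above can be modeled as a small enumeration with a single readiness check. This is a minimal sketch, not the backend's actual code: the `RagStatus` class and the `is_retrieval_ready` helper are hypothetical names, and only the five status strings come from this page.

```python
from enum import Enum


class RagStatus(str, Enum):
    """Status values listed in the lifecycle above (class name is hypothetical)."""

    PENDING = "rag_pending"        # upload succeeded, indexing is queued
    PROCESSING = "rag_processing"  # worker is chunking, embedding, indexing
    AVAILABLE = "rag_available"    # indexed and usable for retrieval
    FAILED = "rag_failed"          # indexing failed
    REMOVED = "rag_removed"        # removed from the retrieval lifecycle


def is_retrieval_ready(status: str) -> bool:
    """Return True only for the one status that allows grounded retrieval."""
    return RagStatus(status) is RagStatus.AVAILABLE
```

A caller that has just uploaded a file would see `is_retrieval_ready("rag_pending")` return `False`, which is the distinction between upload success and retrieval readiness.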

How Uploads Become Retrieval-Ready [#how-uploads-become-retrieval-ready]

The backend upload path currently uses:

* `POST /v1/files/upload`
* `POST /v1/files/upload-stream`

These endpoints do more than store bytes.

They also:

* compute a deterministic object key
* create or update the corresponding library record
* set the initial status to `rag_pending`
* start a `Temporal` workflow on `file-events-queue`

The actual indexing work does not happen inside the upload request itself.

Instead, a separate file-event worker performs the expensive background steps later. If that worker is not running, files can remain stuck in `rag_pending` even though the upload returned success.
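The upload-side steps can be sketched with in-memory stand-ins for object storage, the database, and the workflow engine. Everything here is illustrative: `FakeBackend`, `object_key`, `handle_upload`, and the hash-based key scheme are assumptions, not the real implementation. Only the `rag_pending` status and the `file-events-queue` name come from this page.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FakeBackend:
    """In-memory stand-ins for object storage, the database, and Temporal."""

    objects: dict = field(default_factory=dict)
    library_files: dict = field(default_factory=dict)
    started_workflows: list = field(default_factory=list)


def object_key(user_id: str, filename: str) -> str:
    # Hypothetical key scheme: the same user and filename always map to the same key.
    digest = hashlib.sha256(f"{user_id}/{filename}".encode("utf-8")).hexdigest()
    return f"{user_id}/{digest[:16]}/{filename}"


def handle_upload(backend: FakeBackend, user_id: str, filename: str, data: bytes) -> dict:
    key = object_key(user_id, filename)
    backend.objects[key] = data  # store the source bytes
    record = {
        "key": key,
        "user_id": user_id,
        "filename": filename,
        "status": "rag_pending",  # initial status
    }
    backend.library_files[key] = record  # create or update the library record
    # Indexing is only queued here; the worker does the expensive steps later.
    backend.started_workflows.append({"task_queue": "file-events-queue", "key": key})
    return record
```

Because the key is deterministic, uploading the same file twice updates one record instead of creating a duplicate, while each upload still queues its own indexing run.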

Docling, Chunking, And Embeddings [#docling-chunking-and-embeddings]

For standard document formats, the worker uses `Docling`-based processing to chunk and normalize the source material.

That processing can also produce additional artifacts such as chunk JSON and, where applicable, metadata, images, or tables.

After chunking, embeddings are generated for the chunks and inserted into `Milvus`.

Those vectors are what later power retrieval.
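The worker's order of operations can be sketched as a single function with its dependencies injected. This is a hedged illustration only: `index_file` and its parameters are hypothetical, the `download`, `chunk`, and `embed` callables stand in for object storage, `Docling`, and the embedding model, and plain dictionaries stand in for the artifacts bucket and `Milvus`.

```python
from typing import Callable, Dict, List


def index_file(
    record: Dict[str, str],
    download: Callable[[str], bytes],
    chunk: Callable[[bytes], List[str]],
    embed: Callable[[List[str]], List[List[float]]],
    vectors: Dict[str, list],
    artifacts: Dict[str, List[str]],
) -> str:
    """Run one indexing pass and return the resulting status."""
    key = record["key"]
    record["status"] = "rag_processing"
    vectors.pop(key, None)  # remove older vectors for the same source first
    try:
        data = download(key)              # fetch the stored source file
        chunks = chunk(data)              # chunk and normalize the document
        artifacts[key] = chunks           # persist chunk artifacts
        embeddings = embed(chunks)        # one vector per chunk
        vectors[key] = list(zip(chunks, embeddings))
        record["status"] = "rag_available"
    except Exception:
        # Old vectors are already gone, so the file has no active vectors
        # until a later run succeeds.
        record["status"] = "rag_failed"
    return record["status"]
```

Removing old vectors before re-indexing keeps the index free of stale chunks, at the cost that a failed run leaves the file unsearchable until it is reprocessed.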

How Chat Uses Files Later [#how-chat-uses-files-later]

The portal chat composer has a `RAG` control that searches the file library and lets users pick:

* individual files
* folders
* all available retrieval-ready content

Only files with `rag_available` status are surfaced as directly selectable retrieval files in that picker.

When a user submits a chat request with retrieval enabled, the UI sends:

* `fileSearch`
* optional `documentIds`
* optional `folderIds`

The backend then resolves the effective retrieval scope.

Folder selections are expanded on the backend into the underlying file IDs for that user, and the final retrieval step uses those document IDs when running similarity search.

This makes the flow user-friendly in the portal while still keeping retrieval scoped to concrete indexed documents under the hood.
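Scope resolution can be sketched as a function that merges explicit document IDs with the files found under the selected folders. The data shapes are assumptions made for illustration: `resolve_document_ids`, the `files` list of dictionaries, and the `folder_parents` mapping are hypothetical and do not reflect the real schema.

```python
from typing import Dict, Iterable, List, Optional


def resolve_document_ids(
    user_id: str,
    document_ids: Optional[Iterable[str]],
    folder_ids: Optional[Iterable[str]],
    files: List[Dict[str, Optional[str]]],
    folder_parents: Dict[str, Optional[str]],
) -> List[str]:
    """Return explicit IDs plus the user's files under any selected folder."""
    selected = set(folder_ids or [])

    def in_selected_tree(folder_id: Optional[str]) -> bool:
        # Walk up the parent chain; the seen set guards against cycles.
        seen = set()
        while folder_id is not None and folder_id not in seen:
            if folder_id in selected:
                return True
            seen.add(folder_id)
            folder_id = folder_parents.get(folder_id)
        return False

    resolved: List[str] = []
    for doc_id in document_ids or []:
        if doc_id not in resolved:
            resolved.append(doc_id)  # explicit picks come first, deduplicated
    for f in files:
        if f["user_id"] == user_id and in_selected_tree(f["folder_id"]):
            if f["id"] not in resolved:
                resolved.append(f["id"])
    return resolved
```

Selecting a parent folder therefore pulls in files from nested folders as well, while files owned by other users are never added through folder expansion.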

Retrieval Boundary [#retrieval-boundary]

The retrieval path currently centers on the backend `RetrievalService`.

In practice, it:

* searches a configured `Milvus` collection
* optionally filters by document IDs
* can apply a user-specific filter
* returns chunk text that is assembled into model context

The chat-oriented `file_search` tool uses this retrieval layer so that the assistant can ground responses in the selected content instead of relying only on base-model knowledge.
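The optional document and user filters can be expressed as a `Milvus` boolean filter string. The sketch below only builds that string. The field names `document_id` and `user_id` and the `build_filter` helper are assumptions, since the actual collection schema is not described here.

```python
import json
from typing import Optional, Sequence


def build_filter(
    document_ids: Optional[Sequence[str]] = None,
    user_id: Optional[str] = None,
) -> str:
    """Build a Milvus-style boolean expression; an empty string means no filter."""
    clauses = []
    if document_ids:
        # json.dumps yields a double-quoted list literal such as ["f1", "f2"].
        ids = json.dumps(list(document_ids), ensure_ascii=False)
        clauses.append(f"document_id in {ids}")
    if user_id is not None:
        clauses.append(f"user_id == {json.dumps(user_id, ensure_ascii=False)}")
    return " and ".join(clauses)
```

For example, `build_filter(["f1", "f2"], "u1")` returns `document_id in ["f1", "f2"] and user_id == "u1"`, which would be passed alongside the query vector when running the similarity search.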

Files, Folders, And Scope [#files-folders-and-scope]

The platform treats files and folders differently:

* files are the actual indexed retrieval units
* folders are a user-facing grouping and selection mechanism

That means selecting a folder in chat is really a convenient way to include the files contained in that folder hierarchy.

This also explains why the Files area and the chat `RAG` picker are so closely connected.

Reprocess And Delete [#reprocess-and-delete]

The Files area is not only for initial upload.

It also supports later lifecycle actions.

Reprocess [#reprocess]

Reprocess reruns indexing for an already stored source file.

This is useful when:

* indexing previously failed
* processing settings changed
* the vector representation needs to be rebuilt

Reprocess does not require the user to upload the file again.

Delete [#delete]

Delete removes the file from the library lifecycle and triggers cleanup behavior for the indexed content.

From a user perspective, the important point is that delete is not only a UI metadata operation. It also participates in the broader storage and retrieval cleanup path.

Operational Realities [#operational-realities]

The current implementation has a few practical behaviors that matter to operators and advanced users:

* indexing is asynchronous
* upload success does not guarantee retrieval success
* `Temporal`, `Docling`, object storage, and `Milvus` all need to be available for the full pipeline to complete
* only `/v1/files/upload` and `/v1/files/upload-stream` enter the main RAG indexing path
* a failed reprocess can temporarily leave a file without active vectors until indexing succeeds again

For day-to-day portal use, the most important takeaway is simple: wait until a newly uploaded document shows `RAG Available` (status `rag_available`) before expecting grounded chat answers from it.
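For scripted use, the same advice amounts to polling the file status until it settles. This is a minimal sketch under stated assumptions: `wait_until_available` is a hypothetical helper, and `get_status` is a caller-supplied function, because this page does not describe a status endpoint.

```python
import time
from typing import Callable


def wait_until_available(
    get_status: Callable[[], str],
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the file is retrieval-ready; False on failure or timeout."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == "rag_available":
            return True
        if status in ("rag_failed", "rag_removed"):
            return False  # will not become ready without a reprocess or re-upload
        if clock() >= deadline:
            return False  # still pending or processing; the worker may be down
        sleep(interval_s)
```

A timeout while the status is still `rag_pending` is a hint to check whether the file-event worker is running.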

Why This Matters In CoreAI [#why-this-matters-in-coreai]

This capability is one of the main reasons CoreAI can support grounded assistants and enterprise document workflows.

Without it, chat responses could draw only on the base model's own knowledge.

With it, the platform can:

* ground answers in uploaded content
* scope retrieval to specific files or folders
* connect document management with conversational AI
* support more governed and inspectable knowledge workflows

Related Pages [#related-pages]

* [CoreAI Portal Guide](/docs/guides/coreai-portal)
* [Milvus](/docs/coreai/components/milvus)
* [Docling](/docs/coreai/components/docling)
* [CoreAI API](/docs/coreai/components/coreai-api)
