CoreAI Portal Guide
User guide for the modern CoreAI Portal experience.
The CoreAI Portal is the main user interface of BullSequana AI. It gives business users, AI engineers, and platform teams a single web surface for working with models, conversations, files, assistants, and platform-level configuration.
Access to features depends on the roles and permissions granted in your deployment.
Workspace Overview
After authentication, the portal switches into a sidebar-based workspace that acts as the main navigation shell for the modern SPA experience.

From the sidebar, users can move between the main working areas such as:
- New Chat
- Files
- Custom GPTs
- Models
- Settings
The same sidebar also includes search over chat history, account actions, and workspace-level controls.
This matters because most of the portal features described below are not isolated pages. They are part of one shared workspace where users move between conversation, retrieval, model, and settings tasks without leaving the application shell.
Homepage
The homepage is the main sign-in entry point for the modern portal experience.

From this page, users can start the standard login flow and enter the platform through the shared SSO experience.
The Platform Services panel on the right presents the services exposed in the current environment, such as documentation, model operations, observability, storage, workflow, and administration tools.
This service list is environment-specific and backend-driven, so the exact entries can differ from one deployment to another.
In practice, this page acts as a service directory for the platform: once users authenticate through SSO, they can access the services they are entitled to use, subject to the access rights and authorization rules configured in the deployment.
Authentication
The portal uses the platform-provided SSO flow for authentication.

When a user starts the login flow, the portal redirects through the platform backend to Keycloak, which handles the authentication handoff and returns the user to the portal after a successful login.
In production environments, this SSO experience is typically federated with the organization's actual workplace identity provider.
That means users usually sign in with the same enterprise identity they already use elsewhere, while BullSequana AI applies its own platform access rules on top.
In development or demo environments, federation may not be configured yet. In those cases, teams often use local users created directly in Keycloak for testing and evaluation.
See Development And Demo Clusters for the Keycloak local-user pattern used in those environments.
What the Portal Is For
Use the portal when you want to:
- chat with available AI models
- upload and manage files
- check which models are available in your environment
- manage reusable assistants
- configure selected AI and platform settings
Languages and Appearance
The modern portal also includes built-in language and theme controls from the sidebar menu.

Languages
The current portal locale switcher supports these languages in the application shell:
- English
- Français
- Deutsch
- Svenska
Changing the language updates the active locale route and reloads the portal in the selected language.
Light and dark mode
The same sidebar menu exposes theme choices for:
- Light
- Dark
- System
The portal uses next-themes, so users can either force a specific theme or follow the system preference of their device.
Chat
The chat workspace is the main day-to-day entry point for most users.

This is where users start conversations, switch between available models, and work with the retrieval and assistant features exposed in their environment.
Composer tools
The chat composer includes several important controls:
- the + action menu for adding files to the conversation
- a Search toggle for web search
- a RAG picker for selecting retrieval-ready files and folders
- a model selector for choosing from available chat models and Custom GPTs
The RAG control is connected to files that are already available for retrieval in the platform. This lets users scope a prompt to selected files or folders instead of relying only on the base model.
The model selector is environment-driven. It lists available chat models and also surfaces configured Custom GPTs, with recent selections shown first for convenience.
If a model you expect is missing, check the Models area in the portal or see Models for the platform-side model management view.
What users can do from chat
From the chat workspace, users can typically:
- start a new conversation with an available model
- attach supporting files to a message
- use retrieval over indexed content through RAG
- enable web search when the deployment exposes it
- switch between raw models and higher-level Custom GPTs
- revisit earlier conversations from the sidebar
Streaming and response experience
Responses stream progressively into the conversation instead of appearing only at the end.
Depending on the selected model or assistant, the UI can also show:
- source links for grounded answers
- reasoning sections when the backend exposes them
- tool and assistant activity blocks for intermediate steps
- copy actions for the latest assistant response
This makes the chat view useful not only for end-user prompting, but also for understanding how an assistant or tool-enabled workflow arrived at its answer.
Tool calls, web search, and intermediate steps

When web search or other tool-enabled behavior is active, the conversation can surface intermediate steps directly in the message flow.
That can include:
- tool names such as web_search
- the parameters sent to the tool
- structured tool results
- the final assistant answer generated from those results
This makes it easier to see when the assistant is relying on external tools instead of only generating a direct model completion.
In more agent-like flows, the same conversation area can also expose reasoning traces and other intermediate assistant activity when the backend provides them.
These elements help users distinguish between:
- the final assistant answer
- the intermediate actions the system took while producing it
This is especially useful for debugging assistant behavior, validating retrieval or tool use, and understanding why a response took longer than a simple direct completion.
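As a rough illustration of that distinction, the sketch below splits one conversation turn into its intermediate steps and its final answer. The event shape used here (kind, name, arguments, result, text) is a hypothetical structure chosen for the example, not the portal's actual message format.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Event:
    """One item in a conversation turn (hypothetical shape, for illustration)."""
    kind: str                                  # "tool_call", "reasoning", or "answer"
    name: str = ""                             # tool name, e.g. "web_search"
    arguments: dict[str, Any] = field(default_factory=dict)
    result: Any = None                         # structured tool result, if any
    text: str = ""                             # reasoning or answer text


def split_turn(events: list[Event]) -> tuple[list[Event], str]:
    """Return (intermediate steps, final answer text) for one turn."""
    steps = [e for e in events if e.kind != "answer"]
    answers = [e.text for e in events if e.kind == "answer"]
    # The last answer event is treated as the final assistant answer.
    return steps, (answers[-1] if answers else "")


turn = [
    Event(kind="tool_call", name="web_search",
          arguments={"query": "example query"}, result=["example result"]),
    Event(kind="answer", text="Here is what I found."),
]
steps, final = split_turn(turn)
```

Keeping the steps separate from the answer is what lets a reader audit tool use without confusing it with the response itself.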
Files
The files area is the document and retrieval workspace of the portal.

This is where users upload content, organize it into folders, and track whether a file is ready to be used in retrieval-augmented chat.
What users can do
From the Files area, users can typically:
- upload files into the current folder
- create folders and nested folder structures
- browse and search through their library
- rename or delete files and folders
- move files and folders by drag and drop
- retry indexing for files that failed RAG processing
The page also exposes a live upload indicator so users can see when files are still being transferred in the background.
Uploading files

Uploading is done from the current folder context.
Users can open the upload panel from the main Upload action or from a folder-specific upload action, then drop files into that target location.
This makes it easy to keep retrieval content organized from the start instead of uploading everything into a flat top-level library.
File status and RAG readiness
Each file carries a status that reflects whether it is ready for retrieval.
The important user-visible states are:
- RAG Pending
- RAG Processing
- RAG Available
- RAG Failed
- RAG Removed
This matters because a file can exist in the library before it is actually searchable in chat.
Only retrieval-ready files can be used through the chat RAG picker. In practice, users should wait for RAG Available before expecting grounded answers from that content.
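The readiness rule can be sketched as a small filter. The status labels are the ones listed above; the enum, the function names, and the file names are assumptions made for this example rather than the portal's own code.

```python
from enum import Enum


class RagStatus(Enum):
    """User-visible file states, as named in the Files area."""
    PENDING = "RAG Pending"
    PROCESSING = "RAG Processing"
    AVAILABLE = "RAG Available"
    FAILED = "RAG Failed"
    REMOVED = "RAG Removed"


def retrieval_ready(files: dict[str, RagStatus]) -> list[str]:
    """Names of files that could be offered in the chat RAG picker."""
    return sorted(name for name, status in files.items()
                  if status is RagStatus.AVAILABLE)


def retry_candidates(files: dict[str, RagStatus]) -> list[str]:
    """Names of files whose indexing failed and could be retried."""
    return sorted(name for name, status in files.items()
                  if status is RagStatus.FAILED)


# Hypothetical library contents.
library = {
    "handbook.pdf": RagStatus.AVAILABLE,
    "notes.docx": RagStatus.PROCESSING,
    "scan.pdf": RagStatus.FAILED,
}
ready = retrieval_ready(library)
to_retry = retry_candidates(library)
```

Only handbook.pdf would be selectable for retrieval here, while scan.pdf is a candidate for the retry action.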
Relationship to chat
The Files area is tightly connected to the chat experience.
- files uploaded here become the content source for retrieval
- folders created here can be selected later in the chat RAG control
- failed files can be reprocessed here before trying retrieval again
For a deeper platform-level explanation of how uploads become retrieval-ready data, see Files & RAG.
Models
The Models area is the main operational workspace for model visibility, repository management, model import, and inference deployment.
In the current portal, this area brings together three distinct concerns:
- Overview for models already exposed in the platform
- Repository for MLflow-registered models and versions
- Deployment for guided model onboarding and serving setup

This part of the portal sits on top of multiple backend services:
- LiteLLM for the user-facing model catalog and routing layer
- MLflow for repository-backed model artifacts and version tracking
- Model Installer for deployment and import operations
Overview
The overview tab is the quickest way to inspect what model capacity is available in the current environment.
From here, users can:
- browse the currently exposed models
- search by name, provider, feature, or tag
- filter between on-premises and cloud models
- start a deployment flow from the Deploy Model action
This view is especially useful when users need to confirm which models are actually exposed through the platform before using them in chat, applications, or Custom GPTs.
Repository

The repository tab is the MLflow-backed model registry view.
It is used for models that have already been imported into the platform repository rather than only being referenced by a live serving URL.
From this area, users can:
- browse registered models
- inspect the latest version and status
- open model details
- deploy a repository-backed model into inference
- delete repository entries if they have the required permissions
At the model-detail level, the portal can resolve the latest artifact URI and pass it into the deployment flow. That makes the repository the cleanest path when a model is already present in MLflow and the next step is only to serve it.
Deployment

The deployment tab is the guided entry point for making models available for inference.
It offers:
- Easy Setup
- Advanced Setup
- config upload support
This deployment path ultimately bridges to the Model Installer service.
When the user submits a deployment, the portal assembles the deployment payload and calls the backend installer endpoint that registers the model for inference.
Easy setup and advanced setup

The portal supports two main deployment styles.
Easy Setup is preset-driven.
It:
- lets users browse curated model presets by category
- pre-fills engine, resource profile, features, limits, and tags
- defaults to a simpler installation flow
- allows switching to Customize Deployment when more control is needed
Advanced Setup is closer to the raw deployment contract.
It exposes a fuller deployment form with:
- model source URL or repository-backed source
- deployment name and namespace
- model mode and features
- resource profile and instance count
- scaling and timeout settings
- environment variables and extra arguments
- a YAML preview of the resulting kubeai.org/v1 Model resource
This is the right path when teams need full control over how the model is served.
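To make the shape of that resource more concrete, here is a minimal sketch that assembles a kubeai.org/v1 Model document from form-like inputs. The spec field names follow the KubeAI Model resource as commonly documented (url, engine, features, resourceProfile, minReplicas, maxReplicas, env, args), but treat them as an assumption and compare against the portal's own YAML preview. The helper name and every value below are hypothetical.

```python
import json
from typing import Optional


def build_model_resource(
    name: str,
    namespace: str,
    url: str,
    engine: str,
    features: list[str],
    resource_profile: str,
    min_replicas: int = 1,
    max_replicas: int = 1,
    env: Optional[dict[str, str]] = None,
    args: Optional[list[str]] = None,
) -> dict:
    """Assemble a Model resource as a plain dict (illustrative only)."""
    spec: dict = {
        "url": url,
        "engine": engine,
        "features": features,
        "resourceProfile": resource_profile,
        "minReplicas": min_replicas,
        "maxReplicas": max_replicas,
    }
    # Optional sections are only emitted when the form provides them.
    if env:
        spec["env"] = env
    if args:
        spec["args"] = args
    return {
        "apiVersion": "kubeai.org/v1",
        "kind": "Model",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


# Hypothetical values; none of these come from a real deployment.
resource = build_model_resource(
    name="example-llm",
    namespace="default",
    url="hf://example-org/example-model",
    engine="VLLM",
    features=["TextGeneration"],
    resource_profile="example-gpu-profile:1",
    args=["--max-model-len=4096"],
)
# JSON is also valid YAML, so this can stand in for a preview.
preview = json.dumps(resource, indent=2)
```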
Downloader and repository import

The downloader flow is for importing Hugging Face models into the local repository.
The portal collects:
- the Hugging Face model name
- the revision
- the target MLflow experiment name
- the artifact path, which defaults to model
When submitted, the portal calls the Model Installer download endpoint and then monitors the import asynchronously.
The current flow:
- starts the import through download_hf_model
- polls MLflow until the model becomes READY
- checks Model Installer logs to detect failures
- redirects back to the models area when the import completes
An important practical behavior is that this flow imports into MLflow first. It does not automatically deploy the model for inference.
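The monitoring part of that flow can be sketched as a polling loop. This is not the portal's implementation: the two callables stand in for the MLflow status lookup and the Model Installer log check, and the interval and retry limit are arbitrary assumptions. The READY and PENDING_REGISTRATION values are MLflow model version statuses.

```python
import time
from typing import Callable


def wait_for_import(
    get_status: Callable[[], str],
    get_log_errors: Callable[[], list[str]],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the model version is READY, failing fast on log errors."""
    for _ in range(max_polls):
        errors = get_log_errors()
        if errors:
            # A failure reported in the installer logs ends the wait early.
            raise RuntimeError("import failed: " + errors[0])
        if get_status() == "READY":
            return "READY"
        sleep(poll_seconds)
    raise TimeoutError("model did not become READY in time")


# Simulated run: two pending polls, then READY; no log errors; no real sleep.
_statuses = iter(["PENDING_REGISTRATION", "PENDING_REGISTRATION", "READY"])
result = wait_for_import(lambda: next(_statuses), lambda: [],
                         sleep=lambda _seconds: None)
```

Injecting the status and log functions keeps the loop testable without a running MLflow or Model Installer instance.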
After the model is ready in the repository, teams typically continue with one of these paths:
- deploy from the repository detail page
- open advanced setup with the repository artifact pre-filled
Permissions and operator workflows
Model management actions are permission-sensitive.
In practice, actions such as deployment, download, and delete are restricted to users with model-management rights.
So while many users may be able to inspect available models, only authorized users should expect to manage the model lifecycle.
Practical mental model
The simplest way to think about the Models area is:
- Overview = what is already available to use
- Repository = what has been imported and versioned in MLflow
- Deployment = how a model becomes actively served in the platform
For the platform-level explanation behind these flows, see Models and Model Installer API.
Custom GPTs
The portal includes a Custom GPTs area for creating reusable expert-style assistants on top of the available chat models.

In practice, a Custom GPT is a saved agent configuration owned by the current user.
Each one stores:
- a name
- the selected chat model
- a short description
- a system prompt that defines the expert's behavior
This is useful when users want a repeatable assistant experience rather than starting from a blank chat every time.
Creating a GPT expert

Creating a Custom GPT is a lightweight configuration flow.
Users define:
- GPT Name
- AI Model
- Description
- System Prompt
The system prompt is the most important part. It is what turns a general-purpose model into a more specialized expert persona or task-oriented assistant.
Examples include:
- a marketing expert
- a technical documentation assistant
- a support triage assistant
- a domain-specific analyst
The portal loads the available model list from the backend, so the model picker reflects the models that are actually accessible in the current environment.
How it works at runtime
When a Custom GPT is created, the backend stores it as an agent configuration tied to the current user.
At chat time, the portal can send either:
- a raw model name
- or the UUID of a saved Custom GPT
If a Custom GPT is selected, the backend resolves that saved configuration and replaces the chat request with:
- the configured model_name
- the configured system_prompt
That means the Custom GPT is not a separate model deployment. It is a reusable configuration layer on top of an existing model.
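A minimal sketch of that resolution step follows. The stored fields mirror the ones listed earlier (name, model, description, system prompt). The in-memory store, the function name, and the request shape are assumptions for illustration and do not reflect the backend's actual code.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CustomGpt:
    owner: str
    name: str
    model_name: str
    description: str
    system_prompt: str


def resolve_chat_target(selected: str, user: str,
                        store: dict[str, CustomGpt]) -> dict[str, Optional[str]]:
    """Map the selector value to the model and system prompt to use."""
    gpt = store.get(selected)
    if gpt is not None and gpt.owner == user:
        # A saved Custom GPT: substitute its configured model and prompt.
        return {"model_name": gpt.model_name, "system_prompt": gpt.system_prompt}
    # Otherwise the value is treated as a raw model name.
    return {"model_name": selected, "system_prompt": None}


# Hypothetical data.
gpt_id = str(uuid.uuid4())
store = {gpt_id: CustomGpt(owner="alice", name="Docs Helper",
                           model_name="example-chat-model",
                           description="Technical documentation assistant",
                           system_prompt="You write clear technical docs.")}
via_gpt = resolve_chat_target(gpt_id, "alice", store)
via_raw = resolve_chat_target("example-chat-model", "alice", store)
```

Both calls end up targeting the same underlying model; only the first carries the saved instructions.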
Using a GPT expert in chat

Custom GPTs appear directly in the chat model selector together with the available chat models.
Once selected, the GPT expert becomes the active assistant for that conversation.
The user still keeps the normal chat controls around it, including:
- file upload
- RAG
- web search when available
- streaming responses
So the Custom GPT defines the assistant's default behavior, while the rest of the chat experience still controls how the conversation is grounded and enriched.
Search and reuse
The Custom GPTs page lets users search both by GPT name and by model.
In chat, recently used models and Custom GPTs are also surfaced first, making it easier to switch back to a frequently used expert.
Practical mental model
The simplest way to think about Custom GPTs is:
- model = raw model capability
- Custom GPT = saved model + instructions for a specific expert role
If you need the platform-side view of which models exist and how they are configured, see Models.
Settings
The settings area is aimed more at administrators, AI engineers, and platform operators than general end users.
The settings areas covered in this guide are:
- Authorization Control for users, groups, and access structure
- Service Desk settings for service-desk-oriented AI behavior
- Model Presets for reusable model configuration patterns
- Vector Store for collection management and embedding-backed knowledge data
Some deployments may also expose additional settings such as Global Configuration, or may hide certain sections depending on release level and permissions.
Authorization Control
Authorization Control is the settings area where administrators inspect users and groups and manage the role assignments that drive platform permissions.

The portal exposes two main views here:
- Groups
- Users
Those views are populated from backend authorization endpoints that return Keycloak-backed identities enriched with assigned and computed roles.
What the UI shows
The settings UI fetches:
- the group tree from /v1/authz/groups
- the organization user list from /v1/authz/users
In practice, this means the portal is not showing only raw identity-provider data. It is showing identity data combined with the authorization state computed from the platform's role model.
For users, the portal can show:
- basic identity details
- group memberships
- directly assigned roles
- inherited roles coming from group membership or role hierarchy
For groups, the portal can show:
- the nested group tree
- directly assigned roles for a group
- inherited roles from parent groups

The important distinction is:
- assigned roles are written explicitly
- inherited roles are computed and read-only from the user's perspective in this screen
In the user detail sheet, this becomes very visible: the platform can show a small set of effective inherited roles even when those capabilities are coming from a much larger set of underlying group memberships.
How OpenFGA is used in CoreAI
The actual authorization model is implemented in coreai-llm-backend with OpenFGA.
At schema level, the platform defines:
- direct role assignments such as assigned_admin, assigned_ai_engineer, assigned_developer, and assigned_ui_user
- computed roles such as admin, ai_engineer, developer, and ui_user
- computed permissions such as can_access_api, can_manage_models, can_edit_config, and can_manage_roles
The implementation deliberately checks permissions in application code instead of hardcoding role names for every action.
That means the code can ask questions like:
- can_manage_models
- can_edit_config
- can_manage_roles
instead of coupling every feature directly to one specific role name.
Role hierarchy and capability inheritance
The current OpenFGA model defines a role hierarchy and a permission hierarchy.
Examples from the live schema:
- admin comes from assigned_admin
- ai_engineer comes from assigned_ai_engineer and also from admin
- developer comes from assigned_developer, assigned_ui_user, and also from ai_engineer
- ui_user is computed from developer
And permissions are then derived from those roles:
- can_access_api comes from developer
- can_manage_models comes from ai_engineer
- can_edit_config comes from admin
- can_manage_roles comes from admin
This is why a user may be allowed to do something even if that capability was not assigned directly to them as a standalone permission. It may be inherited through role relationships.
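The derivation rules listed above can be reproduced in a few lines of plain Python. This is only a sketch of the inheritance logic as described in this guide; it does not use OpenFGA, and the function names are made up for the example.

```python
# Each computed role is granted by any one of its listed sources.
ROLE_SOURCES = {
    "admin": {"assigned_admin"},
    "ai_engineer": {"assigned_ai_engineer", "admin"},
    "developer": {"assigned_developer", "assigned_ui_user", "ai_engineer"},
    "ui_user": {"developer"},
}

# Each permission is derived from exactly one computed role.
PERMISSION_SOURCES = {
    "can_access_api": "developer",
    "can_manage_models": "ai_engineer",
    "can_edit_config": "admin",
    "can_manage_roles": "admin",
}


def effective_roles(assigned: set[str]) -> set[str]:
    """Expand direct assignments into computed roles until nothing changes."""
    granted = set(assigned)
    changed = True
    while changed:
        changed = False
        for role, sources in ROLE_SOURCES.items():
            if role not in granted and granted & sources:
                granted.add(role)
                changed = True
    # Report only computed roles, not the assigned_* relations.
    return granted & set(ROLE_SOURCES)


def effective_permissions(assigned: set[str]) -> set[str]:
    roles = effective_roles(assigned)
    return {perm for perm, role in PERMISSION_SOURCES.items() if role in roles}


engineer_roles = effective_roles({"assigned_ai_engineer"})
engineer_perms = effective_permissions({"assigned_ai_engineer"})
```

Under these rules, a user with only assigned_ai_engineer ends up with the ai_engineer, developer, and ui_user roles, and therefore with can_manage_models and can_access_api but not the admin-only permissions.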
How group membership affects access
Group membership is synchronized from Keycloak into OpenFGA by a dedicated sync worker.
That worker:
- reads the Keycloak group tree
- writes user-to-group membership tuples into OpenFGA
- writes subgroup nesting tuples into OpenFGA
- can ensure organization roles for selected groups
- caches the group tree for the authorization endpoints used by the portal
This is a key part of how authorization stays aligned with the identity-provider structure.
In practice, if a role is assigned to a group, the members of that group inherit the effective role and resulting permissions through the OpenFGA graph.
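As a rough model of that inheritance, the sketch below walks a nested group tree and collects the roles a user receives through membership. The dictionaries are an in-memory stand-in for the tuples stored in OpenFGA, and the group names and helper names are hypothetical.

```python
from typing import Optional


def ancestors(group: str, parent_of: dict[str, str]) -> list[str]:
    """Return the group followed by its parent chain, guarding against cycles."""
    chain: list[str] = []
    seen: set[str] = set()
    current: Optional[str] = group
    while current is not None and current not in seen:
        seen.add(current)
        chain.append(current)
        current = parent_of.get(current)   # top-level groups have no parent
    return chain


def roles_from_groups(user_groups: list[str],
                      parent_of: dict[str, str],
                      group_roles: dict[str, set[str]]) -> set[str]:
    """Collect roles assigned to the user's groups and all parent groups."""
    roles: set[str] = set()
    for group in user_groups:
        for ancestor in ancestors(group, parent_of):
            roles |= group_roles.get(ancestor, set())
    return roles


# Hypothetical tree: ml-team is a subgroup of engineering.
parent_of = {"ml-team": "engineering"}
group_roles = {
    "engineering": {"assigned_developer"},
    "ml-team": {"assigned_ai_engineer"},
}
member_roles = roles_from_groups(["ml-team"], parent_of, group_roles)
```

A member of ml-team picks up the role assigned to ml-team and the one assigned to its parent group, without any role being written for the user directly.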
How the portal checks permissions
The portal uses permission checks in two ways.
First, the Authorization Control screen itself gates role-management actions with can_manage_roles.
Second, the rest of the portal uses the same permission model to enable, disable, or hide sensitive actions.
Examples in the current portal include:
- model deployment, download, and delete actions gated by can_manage_models
- global configuration actions gated by can_edit_config
- role assignment and revocation in Authorization Control gated by can_manage_roles
On the frontend, the portal calls /v1/authz/check-permission and /v1/authz/check-role through permission hooks and AuthorizationActionGate wrappers.
If the backend returns 403, the UI treats that as a denied capability and disables or blocks the action accordingly.
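The gating behavior can be sketched as follows. The check callable stands in for a call to the permission-check endpoint and returns only an HTTP status code, because the request and response format is not described here. Treating every non-2xx status as denied is an assumption of this sketch, not documented portal behavior.

```python
from typing import Callable


def is_allowed(permission: str, check: Callable[[str], int]) -> bool:
    """Interpret the status code of a permission check; 403 means denied."""
    status = check(permission)
    if status == 403:
        return False
    # Assumption: anything outside 2xx is also treated as denied (fail closed).
    return 200 <= status < 300


def gate_actions(actions: dict[str, str],
                 check: Callable[[str], int]) -> dict[str, bool]:
    """Map each UI action to whether it should be enabled."""
    return {action: is_allowed(perm, check) for action, perm in actions.items()}


# Hypothetical user who may manage models but nothing else.
def fake_check(permission: str) -> int:
    return 200 if permission == "can_manage_models" else 403


ui_state = gate_actions(
    {
        "deploy_model": "can_manage_models",
        "edit_global_config": "can_edit_config",
        "assign_role": "can_manage_roles",
    },
    fake_check,
)
```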
Backend enforcement examples
The authorization model is not only cosmetic in the UI.
There are also server-side permission checks in coreai-llm-backend.
Examples include:
- /v1/authz/assign-role requiring can_manage_roles
- /v1/authz/assign-group-role requiring can_manage_roles
- /v1/authz/revoke-role requiring can_manage_roles
- /v1/configuration patch and reset operations requiring can_edit_config
So the real flow is:
- identity comes from Keycloak
- membership and role tuples are synchronized into OpenFGA
- roles and permissions are computed from the authorization model
- the portal checks those permissions before exposing actions
- backend endpoints can also reject unauthorized actions with 403
Practical mental model
The simplest way to think about Authorization Control is:
- Keycloak is the identity source
- OpenFGA is the authorization graph and policy engine
- the portal is the management and inspection UI for that graph as exposed through backend APIs
So whether a user can deploy a model, edit shared settings, or manage role assignments depends on the effective permissions produced by that graph, not just on whether the user exists in the identity provider.
Service Desk
The Service Desk settings area configures the default AI behavior for the portal's Service Desk chat experience.

This is an operator-facing configuration screen that answers three runtime questions:
- which chat model should Service Desk use by default
- which retrieval collection should Service Desk use, if any
- whether the Service Desk chat should be enabled at all
What gets configured
The main Service Desk form lets operators choose:
- a default chat model
- a collection to use for retrieval grounding

The available model list is pulled from the portal's chat-model catalog, which comes from fetchChatModels().
The available collection list is pulled from the collection API used for the vector-store workflow.
That means the Service Desk feature is not introducing a separate model-management system of its own. It reuses:
- the existing chat model catalog
- the existing vector-store collection inventory
Enabling the feature
The enable toggle is intentionally dependent on configuration completeness.
The Service Desk toggle stays disabled, and the UI explains why, until a default model and a collection have both been configured.
Once both are set and saved, the operator can enable the feature for users.
This is a practical safeguard: it prevents the portal from exposing a Service Desk chat entry point that has no valid runtime configuration behind it.
Editing and lifecycle
After a Service Desk configuration exists, the settings page switches into a management view that shows the current default model and current collection and allows them to be edited or removed.

From there, operators can:
- review the active model and collection
- edit the current defaults
- delete the configuration
- toggle the feature on or off
Where the settings are stored
Unlike the OpenFGA-backed authorization model, the Service Desk settings are currently persisted by portal server actions in the portal database.
The stored record includes:
- the selected model name
- the selected collection name
- the enabled flag
In the current implementation, this is managed through the serviceDeskModelSettings record handled by portal server actions such as:
- getServiceDeskModel
- updateServiceDeskModel
- updateServiceDeskEnabled
- deleteServiceDeskModel
What happens at runtime
At runtime, the portal layout loads the Service Desk settings and mounts the Service Desk chat only when the feature is enabled.
When it is enabled:
- the Service Desk chat receives the configured default model
- it receives the configured collection name
- it automatically enables retrieval mode when a collection is present
The chat component then sends requests using:
- the selected model
- useRag: true when the Service Desk collection is configured
- collection: <configured-collection> for retrieval-backed answers
So the Service Desk settings page directly controls how the Service Desk chat behaves for end users.
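A small sketch of how the stored settings could translate into a chat request. The useRag and collection keys come from the description above. The settings class, the model and message keys, and the function name are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceDeskSettings:
    """Stored record: selected model, selected collection, enabled flag."""
    model: Optional[str]
    collection: Optional[str]
    enabled: bool = False


def build_chat_request(settings: ServiceDeskSettings,
                       message: str) -> Optional[dict]:
    """Return the request body, or None when the chat is not mounted."""
    if not settings.enabled or not settings.model:
        return None
    request: dict = {"model": settings.model, "message": message}
    if settings.collection:
        # Retrieval mode is switched on whenever a collection is configured.
        request["useRag"] = True
        request["collection"] = settings.collection
    return request


# Hypothetical configuration.
settings = ServiceDeskSettings(model="example-chat-model",
                               collection="default_collection",
                               enabled=True)
request = build_chat_request(settings, "How do I reset my password?")
```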
Practical mental model
The simplest way to think about Service Desk is:
- Models provides the default LLM choice
- Vector Store provides the knowledge collection
- Service Desk binds those together into a dedicated support-oriented chat experience
This makes Service Desk a small orchestration layer on top of the existing model and retrieval capabilities rather than a separate AI subsystem.
Model Presets
Model Presets are one of the most important operator-facing settings because they directly feed the guided model deployment flow described in the Models section above.

In practice, a model preset is a reusable deployment template stored in the portal database.
The portal uses active presets to:
- populate the Easy Setup catalog
- group presets by category such as llm, embedding, audio, and multimodal
- pre-fill deployment defaults when a user chooses a preset
That means model presets are one of the main bridges between platform administration and day-to-day model onboarding.
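The grouping step can be sketched like this. The category names are the ones mentioned above, while the preset record shape (name, category, active) and the preset names are assumptions for illustration.

```python
from collections import defaultdict


def easy_setup_catalog(presets: list[dict]) -> dict[str, list[str]]:
    """Group the names of active presets by category."""
    catalog: dict[str, list[str]] = defaultdict(list)
    for preset in presets:
        if preset.get("active"):
            catalog[preset["category"]].append(preset["name"])
    # Sort for a stable, predictable catalog order.
    return {category: sorted(names)
            for category, names in sorted(catalog.items())}


# Hypothetical presets.
presets = [
    {"name": "example-chat-small", "category": "llm", "active": True},
    {"name": "example-chat-large", "category": "llm", "active": True},
    {"name": "example-embedder", "category": "embedding", "active": True},
    {"name": "retired-model", "category": "llm", "active": False},
]
catalog = easy_setup_catalog(presets)
```

Inactive presets stay in the settings view but drop out of the catalog that users browse.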
Preset management view
The settings page lets operators:
- browse all presets
- search by preset name
- filter by category
- create new presets
- edit existing presets
- delete presets
This is the control plane for the curated deployment experience exposed later in Models -> Deployment -> Easy Setup.
Creating a preset

Creating a preset opens a dedicated configuration form where operators define both the model identity and the deployment defaults that should be reused later.

The preset form includes:
- model URL and model name
- category and description
- engine
- model mode
- default features and tags
- resource profile, instances, and replica settings
- token and vector-dimension settings where relevant
- environment variables and arguments
The form also loads active resource profiles so preset defaults stay aligned with deployable infrastructure options.
Why presets matter
Without presets, every deployment would start from a much more manual configuration flow.
With presets, operators can standardize:
- which model variants are offered to users
- which engine and mode should be used
- what default resource profile and scaling values apply
- what tags, features, and limits should be pre-populated
This is why the Easy Setup model catalog can present a curated list of deployable options instead of exposing only a blank advanced form.
Relationship to Models
The relationship is straightforward:
- Settings -> Model Presets defines curated deployment templates
- Models -> Deployment -> Easy Setup consumes those templates
- Models -> Advanced Setup remains available when teams need to override or bypass preset-driven defaults
So when you update a preset here, you are effectively shaping the guided deployment experience available elsewhere in the portal.
Vector Store
The Vector Store settings area is where operators manage the retrieval collections used by CoreAI RAG workflows.

At portal level, this is the main UI for:
- listing available collections
- creating new collections
- inspecting collection size and vector dimensions
- searching documents already indexed into a collection
- deleting documents or whole collections
Under the hood, this section talks to the backend collection APIs implemented in coreai-llm-backend.
What a collection represents
A collection is the retrieval boundary used by the vector store.
In practice, a collection combines:
- stored chunk embeddings in Milvus
- the embedding model name used for that collection
- collection schema metadata
- the document chunks that become searchable through RAG
This means the collection is not just a folder-like label. It is the actual vector-search target used later by retrieval workflows.
Creating a collection

When creating a collection, the portal asks for:
- collection name
- embedding model
- optional description
The portal fetches available embedding models and can preselect the default embedding model when one is marked as default.
When the collection is created, the backend stores the chosen embedding model in the collection schema itself, together with the vector dimension.
That backend schema always includes core fields such as:
- document_id
- embedding
- embedding_model
- page_content
- metadata
The backend also creates the vector index with a predefined Milvus configuration, including HNSW indexing and COSINE similarity by default.
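For orientation, the sketch below describes such a collection as plain data. The field names, HNSW, and COSINE come from the text above. The field types, the primary-key choice, the HNSW parameters (M, efConstruction), and the example names are assumptions and may differ from what the backend actually configures.

```python
def collection_spec(name: str, embedding_model: str, dimension: int) -> dict:
    """Describe a retrieval collection as plain data (illustrative only)."""
    return {
        "name": name,
        "fields": [
            # Types below are assumed; only the field names are from the guide.
            {"name": "document_id", "type": "VARCHAR", "primary": True},
            {"name": "embedding", "type": "FLOAT_VECTOR", "dim": dimension},
            {"name": "embedding_model", "type": "VARCHAR"},
            {"name": "page_content", "type": "VARCHAR"},
            {"name": "metadata", "type": "JSON"},
        ],
        "embedding_model": embedding_model,
        "index": {
            "field": "embedding",
            "index_type": "HNSW",
            "metric_type": "COSINE",
            # Assumed build parameters; the backend's values are not stated.
            "params": {"M": 16, "efConstruction": 200},
        },
    }


# Hypothetical embedding model name and dimension.
spec = collection_spec("default_collection", "example-embedding-model", 1024)
field_names = [f["name"] for f in spec["fields"]]
```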
Inspecting and searching a collection

Opening a collection lets operators:
- inspect vector dimension and document count
- browse indexed documents
- run similarity search inside that collection
- delete individual documents
- delete the whole collection
This makes the Vector Store page both a management UI and a practical debugging surface for retrieval quality.
If users are not getting the expected grounded answers, this is one of the first places to check whether the right collection exists and whether the expected documents are actually present in it.
Why the embedding model matters
One important implementation detail is that the embedding model is attached to the collection itself.
In coreai-llm-backend, document processing and vector insertion later call get_embedding_model_from_collection logic to resolve the embedding model from the target collection.
That means:
- users choose the embedding model when creating the collection
- Files and RAG processing do not need users to choose that embedder again for every upload
- the ingestion path automatically uses the embedding model already defined by the collection
This is the reason the Files/RAG pipeline can stay simpler for users while still remaining consistent with the vector-store configuration.
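The lookup described above can be sketched as follows. The dictionaries stand in for the collection metadata and the available embedders, and the toy embedder only exists to make the example runnable. None of these names are the backend's actual functions.

```python
from typing import Callable

Embedder = Callable[[str], list[float]]


def embed_for_collection(collection: str,
                         chunks: list[str],
                         collections: dict[str, dict],
                         embedders: dict[str, Embedder]) -> list[dict]:
    """Embed chunks with the model recorded on the target collection."""
    # The caller never names an embedder; it is resolved from the collection.
    model_name = collections[collection]["embedding_model"]
    embed = embedders[model_name]
    return [
        {"page_content": chunk,
         "embedding": embed(chunk),
         "embedding_model": model_name}
        for chunk in chunks
    ]


# Hypothetical registry and a toy embedder that encodes text length.
collections = {"default_collection": {"embedding_model": "example-embedder"}}
embedders = {"example-embedder": lambda text: [float(len(text))]}
rows = embed_for_collection("default_collection", ["hello", "world!"],
                            collections, embedders)
```

Because the model name travels with the collection, every upload into the same collection is embedded consistently.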
Default collection
The backend document populator currently defines default_collection as its default Milvus target.
That is an important operational convention.
If teams want a standard out-of-the-box retrieval destination, creating a collection named default_collection is the natural baseline choice.
This is especially useful for environments where:
- a shared default RAG target is expected
- Service Desk or other retrieval-backed features need a predictable collection name
- operators want a conventional collection available before more specialized collections are introduced
Relationship to Files & RAG
The connection to the Files and RAG pipeline is direct:
- Vector Store defines the target collection and embedding model
- Files ingestion writes chunks into a collection
- chat and retrieval flows later search that collection
So if you think about the flow end to end:
- Vector Store defines where embeddings live
- Files provides the source documents
- Chat and Service Desk consume the resulting retrieval context
For the platform-level pipeline details, see Files & RAG.
Practical Tips
- Use the portal as the fastest way to see which models are currently available to you.
- Use the Models area before copying model names into developer tools or applications.
- Use settings and administrative sections only if your role is meant to manage shared platform behavior.
- If a feature is missing, first check whether it is disabled by role, release level, or deployment configuration.