Components
KubeAI
Inference orchestration component for serving AI workloads in BullSequana AI Runtime.
Component Category
Inference / model serving orchestration
Component Description
KubeAI is a Kubernetes-native inference operator for deploying and scaling AI models in production.
Why It Is Used
In BullSequana AI Runtime, KubeAI provides the operational layer for running model-serving workloads on Kubernetes, giving them predictable scaling, request routing, and platform integration.
Interacts With
MinIO, which provides object storage and dedicated credentials for KubeAI.
vLLM and FasterWhisper, which are part of the model-serving runtime KubeAI orchestrates.
Model Installer, which targets the KubeAI service endpoint to register and manage models.
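
As a rough illustration of what a model registration for KubeAI can look like, the sketch below builds a model description in Python. It assumes the upstream KubeAI Model custom resource (apiVersion kubeai.org/v1) and its field names. The helper build_model_manifest, the example model name, and the example URL are hypothetical. How Model Installer actually communicates with the KubeAI service endpoint in BullSequana AI Runtime is not covered here and may differ.

```python
import json

# Assumption: engine and feature names follow the upstream KubeAI Model
# resource. The Runtime deployment may accept a different set of values.
ENGINE_FEATURES = {
    "VLLM": "TextGeneration",
    "FasterWhisper": "SpeechToText",
}


def build_model_manifest(name, url, engine, min_replicas=0, max_replicas=1):
    """Return a dict shaped like a KubeAI Model resource (hypothetical helper)."""
    if engine not in ENGINE_FEATURES:
        raise ValueError(f"unsupported engine: {engine!r}")
    if min_replicas < 0 or max_replicas < min_replicas:
        raise ValueError("replica bounds must satisfy 0 <= min <= max")
    return {
        "apiVersion": "kubeai.org/v1",
        "kind": "Model",
        "metadata": {"name": name},
        "spec": {
            # One feature per engine keeps the sketch simple.
            "features": [ENGINE_FEATURES[engine]],
            "url": url,
            "engine": engine,
            # Replica bounds within which the operator may scale the model.
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
        },
    }


if __name__ == "__main__":
    # Hypothetical model name and URL, for illustration only.
    manifest = build_model_manifest(
        name="example-chat-model",
        url="hf://example-org/example-model",
        engine="VLLM",
        max_replicas=3,
    )
    print(json.dumps(manifest, indent=2))
```

Keeping the engine-to-feature mapping in one table means a vLLM text model and a FasterWhisper speech model go through the same validation path before anything is sent to the cluster.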