Use Local Models via API

How developers should connect to BullSequana AI models from local tools and applications

When developing against BullSequana AI, the preferred integration point is the CoreAI API.

Do not integrate directly with LiteLLM for application development. LiteLLM is an internal platform component and may change over time. The CoreAI API is the stable interface that BullSequana AI exposes to developers.

Recommended Background

This page is primarily for application developers and AI engineers integrating external tools or custom applications with the platform.

Helpful experience includes:

API-based application development
bearer-token or API-key authentication
local development with Docker when reproducing platform-adjacent flows
basic understanding of OpenAI-compatible clients and SDKs

What To Use

Use the API endpoint exposed by your BullSequana AI deployment:

export BSQAI_BASE_URL="https://llm-backend.<platform-domain>/v1"

The llm-backend subdomain stays the same across environments. Only the platform domain changes from one cluster to another.
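Because only the platform domain varies between clusters, the base URL can be derived rather than hardcoded per environment. The following is a minimal sketch; the helper name `coreai_base_url` and the domain `ai.example.com` are illustrative assumptions, not part of the platform.

```python
def coreai_base_url(platform_domain: str) -> str:
    """Return the CoreAI API base URL for a given platform domain."""
    domain = platform_domain.strip().strip(".")
    if not domain or "/" in domain:
        raise ValueError(f"not a bare domain: {platform_domain!r}")
    # The llm-backend subdomain is fixed; only the platform domain varies.
    return f"https://llm-backend.{domain}/v1"


# "ai.example.com" is a placeholder platform domain, not a real deployment.
print(coreai_base_url("ai.example.com"))
```

The result can be exported as BSQAI_BASE_URL so that the other examples on this page work unchanged.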

This API already handles the platform integration behind the scenes:

model routing
authentication
platform policy enforcement
isolation from future changes to internal components behind the API boundary

Authentication Options

The backend supports both of these:

JWT bearer tokens
BullSequana API keys in the sk-bsq-... format

For interactive user access, JWT is a good fit.

For local developer tools, scripts, IDE plugins, and long-lived integrations, API keys are usually the cleaner option when your deployment exposes API key management.
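Both credential types are sent the same way, so a script can pick whichever one is available. The sketch below prefers an API key and falls back to a JWT; that ordering is an assumption based on the guidance above, and `auth_headers` is a hypothetical helper name. The variable names BSQAI_API_KEY and BSQAI_TOKEN match the ones used elsewhere on this page.

```python
import os


def auth_headers(env=os.environ) -> dict:
    """Build the Authorization header from whichever credential is set."""
    # Prefer the API key for scripts and tools; fall back to a JWT.
    credential = env.get("BSQAI_API_KEY") or env.get("BSQAI_TOKEN")
    if not credential:
        raise RuntimeError("set BSQAI_API_KEY or BSQAI_TOKEN")
    # Both credential types are passed as a bearer token.
    return {"Authorization": f"Bearer {credential}"}


# A made-up key value, used only to show the resulting header.
print(auth_headers({"BSQAI_API_KEY": "sk-bsq-v1-example"}))
```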

Option 1: Authenticate with JWT

If your platform uses Keycloak-backed interactive access, obtain a JWT and pass it as the bearer token.

Example:

export BSQAI_TOKEN="<jwt-token>"

curl "$BSQAI_BASE_URL/models" \
  -H "Authorization: Bearer $BSQAI_TOKEN"

In local backend development, the sibling coreai-llm-backend repo already includes a helper script:

./scripts/get_jwt_token.sh --print-token
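JWTs expire, so a script that reuses a token may want to check how long it remains valid before calling the API. The sketch below reads the standard `exp` claim from the token payload. It assumes the token issued by your deployment carries that claim, and it does not verify the signature, so treat it as a local convenience check only. The function name and the demo token are illustrative.

```python
import base64
import json
import time


def jwt_seconds_remaining(token, now=None):
    """Return seconds until the token's exp claim. Does NOT verify the signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    current = time.time() if now is None else now
    return claims["exp"] - current


def _encode_segment(obj):
    """Encode a dict as an unpadded base64url JWT segment (demo only)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# A synthetic, unsigned token used only to exercise the helper.
demo_token = ".".join(
    [_encode_segment({"alg": "none"}), _encode_segment({"exp": 1000}), "signature"]
)
print(jwt_seconds_remaining(demo_token, now=400))  # 600
```

A negative result means the token has already expired and a new one should be obtained.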

Option 2: Create and use an API key

API keys are created through the CoreAI API itself. The full key value is returned only once, in the creation response, so store it securely at that point.

First call the API with a valid JWT:

curl -X POST "$BSQAI_BASE_URL/api-keys" \
  -H "Authorization: Bearer $BSQAI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Local Development"
  }'

Typical response shape:

{
  "id": "a37e6b2e-728e-4ffc-aea4-e5415616a96a",
  "name": "Local Development",
  "key": "sk-bsq-v1-...",
  "masked": "sk-bsq-v1-********-e5b4a3",
  "created_at": "2024-01-15T10:30:00Z"
}
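The same key-creation call can be scripted. The sketch below uses only the Python standard library and assumes the request and response shapes shown above; the helper names `build_create_key_request` and `extract_key` are hypothetical. Building the request is kept separate from sending it so the logic can be checked without network access.

```python
import json
import urllib.request


def build_create_key_request(base_url, jwt, name):
    """Prepare the POST /api-keys request without sending it."""
    return urllib.request.Request(
        f"{base_url}/api-keys",
        data=json.dumps({"name": name}).encode(),
        method="POST",
        headers={
            "Authorization": f"Bearer {jwt}",
            "Content-Type": "application/json",
        },
    )


def extract_key(response_body):
    """Pull the one-time key value out of the creation response."""
    key = json.loads(response_body)["key"]
    # Keys are documented as using the sk-bsq-... format.
    if not key.startswith("sk-bsq-"):
        raise ValueError("unexpected key format")
    return key


# To send the request against a real deployment:
#   with urllib.request.urlopen(request) as resp:
#       api_key = extract_key(resp.read().decode())
```

Because the full key is not shown again, write it to a secret store or local credentials file immediately rather than printing it.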

Then use that key as the bearer token:

export BSQAI_API_KEY="sk-bsq-v1-..."

curl "$BSQAI_BASE_URL/models" \
  -H "Authorization: Bearer $BSQAI_API_KEY"

Discover Available Models

Before wiring an application, check which models your deployment exposes. Available models are visible in the CoreAI Portal.

To query the current environment directly, call the models endpoint:

curl "$BSQAI_BASE_URL/models" \
  -H "Authorization: Bearer $BSQAI_API_KEY"

This is the correct source of truth for model names in your environment.
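In a script, the model names can be extracted from the response body. The sketch below assumes the endpoint returns the usual OpenAI-compatible list shape, an object with a `data` array whose entries carry an `id`; confirm this against your deployment. The model names in the sample are placeholders.

```python
import json


def model_ids(models_response):
    """Return the sorted model identifiers from a /models response body."""
    payload = json.loads(models_response)
    return sorted(entry["id"] for entry in payload.get("data", []))


# Sample body with made-up model names, in the assumed OpenAI-compatible shape.
sample = '{"object": "list", "data": [{"id": "model-b"}, {"id": "model-a"}]}'
print(model_ids(sample))  # ['model-a', 'model-b']
```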

Use the OpenAI-Compatible API

The CoreAI API exposes OpenAI-compatible endpoints, including:

/v1/models
/v1/chat/completions
/v1/responses

As a result, many SDKs and tools can be pointed at BullSequana AI by changing only the base URL and the token.

Chat Completions example

curl -X POST "$BSQAI_BASE_URL/chat/completions" \
  -H "Authorization: Bearer $BSQAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<model-name-from-/v1/models>",
    "messages": [
      { "role": "user", "content": "Hello from BullSequana AI" }
    ]
  }'
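For environments where the OpenAI SDK is not installed, the same call can be made with the Python standard library. This is a sketch under the assumption that the response follows the OpenAI-compatible chat completion shape (`choices[0].message.content`); `chat_request` and `reply_text` are hypothetical helper names.

```python
import json
import urllib.request


def chat_request(base_url, token, model, prompt):
    """Prepare a POST /chat/completions request without sending it."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(body).encode(),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def reply_text(response_body):
    """Return the assistant text from a chat completion response body."""
    return json.loads(response_body)["choices"][0]["message"]["content"]


# A hand-written sample response in the assumed shape, not real model output.
sample_response = json.dumps(
    {"choices": [{"message": {"role": "assistant", "content": "Hello!"}}]}
)
print(reply_text(sample_response))  # Hello!
```

To send the request, pass the prepared object to `urllib.request.urlopen` and hand the decoded body to `reply_text`.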

Responses API example

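import os

from openai import OpenAI

# Read the endpoint and credential from the environment instead of hardcoding them.
client = OpenAI(
    api_key=os.environ["BSQAI_API_KEY"],
    base_url=os.environ["BSQAI_BASE_URL"],
)

response = client.responses.create(
    model="<model-name-from-/v1/models>",
    input="Explain the deployment model of BullSequana AI.",
)

# output_text is the SDK's convenience accessor for the generated text.
print(response.output_text)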

IDE And Tooling Integrations

For tools such as Continue, OpenCode, or custom internal applications, point the tool to the CoreAI API, not to LiteLLM directly.

The correct pattern is:

provider type: OpenAI-compatible
base URL: your BullSequana AI CoreAI API
bearer token: JWT or sk-bsq-... API key
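As one concrete instance of this pattern, the official OpenAI Python SDK reads its base URL and key from the OPENAI_BASE_URL and OPENAI_API_KEY environment variables when they are not passed explicitly. The sketch below maps CoreAI settings onto those names. Other tools use their own configuration keys, so check each tool's documentation; the helper name and the values shown are placeholders.

```python
import os


def openai_sdk_env(base_url, token):
    """Map CoreAI settings onto the variables the OpenAI Python SDK reads."""
    return {"OPENAI_BASE_URL": base_url, "OPENAI_API_KEY": token}


# Placeholder values; in practice use BSQAI_BASE_URL and a real credential.
settings = openai_sdk_env(
    "https://llm-backend.ai.example.com/v1", "sk-bsq-v1-example"
)
os.environ.update(settings)
print(sorted(settings))  # ['OPENAI_API_KEY', 'OPENAI_BASE_URL']
```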

Recommended Integration Rule

Use this rule for all developer-facing integrations:

Application or tool -> CoreAI API -> platform components

Not:

Application or tool -> LiteLLM

That keeps the developer contract stable even if the platform team changes the internal inference or proxy layer later.
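A project can enforce this rule with a small startup check on the configured base URL. The sketch below only tests that the URL uses HTTPS and the llm-backend subdomain described earlier; it is a heuristic under that assumption, and both the function name and the example hosts are hypothetical.

```python
from urllib.parse import urlparse


def is_coreai_endpoint(url):
    """Return True if the URL looks like the CoreAI API endpoint."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    # The CoreAI API is served over HTTPS from the llm-backend subdomain.
    return parsed.scheme == "https" and host.startswith("llm-backend.")


# Both hosts below are made-up examples.
print(is_coreai_endpoint("https://llm-backend.ai.example.com/v1"))  # True
print(is_coreai_endpoint("http://litellm.internal:4000"))  # False
```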
