Use Local Models via API
How developers should connect to BullSequana AI models from local tools and applications
When developing against BullSequana AI, the preferred integration point is the CoreAI API.
Do not integrate directly with LiteLLM for application development. LiteLLM is an internal platform component and may change over time. The CoreAI API is the stable interface that BullSequana AI exposes to developers.
Recommended Background
This page is primarily for application developers and AI engineers integrating external tools or custom applications with the platform.
Helpful experience includes:
- API-based application development
- bearer-token or API-key authentication
- local development with Docker when reproducing platform-adjacent flows
- basic understanding of OpenAI-compatible clients and SDKs
What To Use
Use the API endpoint exposed by your BullSequana AI deployment:
export BSQAI_BASE_URL="https://llm-backend.<platform-domain>/v1"
The llm-backend subdomain stays the same across environments. Only the platform domain changes from one cluster to another.
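For Python tooling, the same base URL rule can be expressed as a small helper. This is a minimal sketch, not part of the platform: the function names and the BSQAI_PLATFORM_DOMAIN variable are hypothetical, while BSQAI_BASE_URL matches the export shown above.

```python
import os


def coreai_base_url(platform_domain: str) -> str:
    """Build the CoreAI API base URL for a given platform domain."""
    domain = platform_domain.strip().strip(".")
    if not domain:
        raise ValueError("platform_domain must not be empty")
    # The llm-backend subdomain is fixed; only the platform domain varies.
    return f"https://llm-backend.{domain}/v1"


def resolve_base_url(env=os.environ) -> str:
    """Prefer an explicit BSQAI_BASE_URL, otherwise derive it from the domain."""
    explicit = env.get("BSQAI_BASE_URL")
    if explicit:
        return explicit.rstrip("/")
    # BSQAI_PLATFORM_DOMAIN is a hypothetical variable name used for illustration.
    return coreai_base_url(env.get("BSQAI_PLATFORM_DOMAIN", ""))
```

Keeping the domain as the only variable part means the same code can move between clusters by changing one setting.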
This API already handles the platform integration behind the scenes:
- model routing
- authentication
- platform policy enforcement
- future component evolution behind the API boundary
Authentication Options
The backend supports both of these:
- JWT bearer tokens
- BullSequana API keys in the sk-bsq-... format
For interactive user access, JWT is a good fit.
For local developer tools, scripts, IDE plugins, and long-lived integrations, API keys are usually the cleaner option when your deployment exposes API key management.
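Both credential types travel in the same Authorization header, so client code can treat them uniformly. The sketch below assumes that any value not starting with sk-bsq- is a JWT; that heuristic and the function names are illustrative, not a platform API.

```python
def credential_kind(credential: str) -> str:
    """Classify a credential as 'api_key' or 'jwt' based on its prefix."""
    # Assumption: anything without the sk-bsq- prefix is treated as a JWT.
    return "api_key" if credential.startswith("sk-bsq-") else "jwt"


def auth_headers(credential: str) -> dict:
    """Return the bearer header used for both JWTs and API keys."""
    if not credential:
        raise ValueError("credential must not be empty")
    return {"Authorization": f"Bearer {credential}"}
```

The classification is only useful for logging or diagnostics; the request header is identical in both cases.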
Option 1: Authenticate with JWT
If your platform uses Keycloak-backed interactive access, obtain a JWT and pass it as the bearer token.
Example:
export BSQAI_TOKEN="<jwt-token>"
curl "$BSQAI_BASE_URL/models" \
  -H "Authorization: Bearer $BSQAI_TOKEN"
In local backend development, the sibling coreai-llm-backend repo already includes a helper script:
./scripts/get_jwt_token.sh --print-token
Option 2: Create and use an API key
API keys are created through the CoreAI API itself and are returned only once when created.
First call the API with a valid JWT:
curl -X POST "https://llm-backend.<platform-domain>/v1/api-keys" \
  -H "Authorization: Bearer <jwt-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Local Development"
  }'
Typical response shape:
{
  "id": "a37e6b2e-728e-4ffc-aea4-e5415616a96a",
  "name": "Local Development",
  "key": "sk-bsq-v1-...",
  "masked": "sk-bsq-v1-********-e5b4a3",
  "created_at": "2024-01-15T10:30:00Z"
}
Then use that key as the bearer token:
export BSQAI_API_KEY="sk-bsq-v1-..."
curl "$BSQAI_BASE_URL/models" \
  -H "Authorization: Bearer $BSQAI_API_KEY"
Discover Available Models
Before wiring an application, check which models your deployment exposes.
Available models are also visible in the CoreAI Portal.
If you want the API-level source of truth for the current environment, use:
curl "$BSQAI_BASE_URL/models" \
  -H "Authorization: Bearer $BSQAI_API_KEY"
Use the model names returned by this call as the model value in your requests.
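The same lookup can be done from Python with only the standard library. This is a sketch under one assumption: that the endpoint returns the usual OpenAI-style list shape, an object with a data array whose items carry an id field. Check the actual response of your deployment before relying on it. The function names are illustrative.

```python
import json
import urllib.request


def build_models_request(base_url: str, token: str) -> urllib.request.Request:
    """Prepare a GET request for the models endpoint without sending it."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/models",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def model_ids(payload: dict) -> list:
    """Extract model names from the response body."""
    # Assumption: OpenAI-style list response {"object": "list", "data": [{"id": ...}]}.
    return [item["id"] for item in payload.get("data", []) if "id" in item]


def list_models(base_url: str, token: str, timeout: float = 10.0) -> list:
    """Call the models endpoint and return the model names it reports."""
    request = build_models_request(base_url, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return model_ids(json.load(response))
```

Splitting request construction from parsing keeps both parts checkable without network access; only list_models contacts the API.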
Use the OpenAI-Compatible API
The CoreAI API exposes OpenAI-compatible endpoints, including:
- /v1/models
- /v1/chat/completions
- /v1/responses
That means many SDKs and tools can be pointed to BullSequana AI with only a base URL and token change.
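In practice, that base URL and token change often reduces to reading two settings. The sketch below collects them from the environment variables used on this page (BSQAI_BASE_URL, BSQAI_API_KEY, BSQAI_TOKEN); the function name and the preference for an API key over a JWT are assumptions made for illustration.

```python
import os


def openai_client_settings(env=os.environ) -> dict:
    """Collect the two settings an OpenAI-compatible client needs."""
    base_url = env.get("BSQAI_BASE_URL")
    # Assumption: prefer an API key and fall back to a JWT if no key is set.
    token = env.get("BSQAI_API_KEY") or env.get("BSQAI_TOKEN")
    if not base_url or not token:
        raise RuntimeError(
            "Set BSQAI_BASE_URL and either BSQAI_API_KEY or BSQAI_TOKEN"
        )
    return {"base_url": base_url.rstrip("/"), "api_key": token}
```

With the openai Python package installed, the result can be passed straight to the client constructor, for example OpenAI(**openai_client_settings()).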
Chat Completions example
curl -X POST "$BSQAI_BASE_URL/chat/completions" \
  -H "Authorization: Bearer $BSQAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<model-name-from-/v1/models>",
    "messages": [
      { "role": "user", "content": "Hello from BullSequana AI" }
    ]
  }'
Responses API example
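from openai import OpenAI

# Point the standard OpenAI client at the CoreAI API instead of api.openai.com.
client = OpenAI(
    api_key="sk-bsq-v1-...",  # a BullSequana API key or a JWT
    base_url="https://llm-backend.<platform-domain>/v1",
)

# Use a model name returned by /v1/models in your environment.
response = client.responses.create(
    model="<model-name-from-/v1/models>",
    input="Explain the deployment model of BullSequana AI.",
)
print(response)
IDE And Tooling Integrations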
For tools such as Continue, OpenCode, or custom internal applications, point the tool to the CoreAI API, not to LiteLLM directly.
The correct pattern is:
- provider type: OpenAI-compatible
- base URL: your BullSequana AI CoreAI API
- bearer token: JWT or sk-bsq-... API key
Recommended Integration Rule
Use this rule for all developer-facing integrations:
Application or tool -> CoreAI API -> platform components
Not:
Application or tool -> LiteLLM
That keeps the developer contract stable even if the platform team changes the internal inference or proxy layer later.
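A team that wants to enforce this rule in its own tooling could add a startup check on the configured base URL. The sketch below relies only on the fact that the CoreAI API lives on the llm-backend subdomain; the function name is hypothetical, and the check is a convenience guard, not a security control.

```python
from urllib.parse import urlparse


def is_coreai_endpoint(url: str) -> bool:
    """Return True if the URL targets the llm-backend host over HTTPS."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    # The llm-backend subdomain is the same in every environment.
    return parsed.scheme == "https" and host.startswith("llm-backend.")
```

A tool could call this once at startup and refuse to run when the configured URL points anywhere else, such as directly at an internal proxy.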