# Use Local Models via API (/docs/development/use-local-models-via-api)


When developing against BullSequana AI, the preferred integration point is the `CoreAI API`.

Do not integrate directly with `LiteLLM` for application development. LiteLLM is an internal platform component and may change over time. The CoreAI API is the stable interface that BullSequana AI exposes to developers.

Recommended Background [#recommended-background]

This page is primarily for application developers and AI engineers integrating external tools or custom applications with the platform.

Helpful experience includes:

* API-based application development
* bearer-token or API-key authentication
* local development with Docker when reproducing platform-adjacent flows
* basic understanding of OpenAI-compatible clients and SDKs

What To Use [#what-to-use]

Use the API endpoint exposed by your BullSequana AI deployment:

```bash
export BSQAI_BASE_URL="https://llm-backend.<platform-domain>/v1"
```

The `llm-backend` subdomain stays the same across environments. Only the platform domain changes from one cluster to another.
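Because only the platform domain varies, the base URL can be derived from that single value. The following is a minimal sketch; the function name `coreai_base_url` and the domain `ai.example.com` are illustrative assumptions, not part of the platform.

```python
def coreai_base_url(platform_domain: str) -> str:
    """Build the CoreAI API base URL for a platform domain (hypothetical helper)."""
    domain = platform_domain.strip().strip("/")
    if not domain or "://" in domain:
        raise ValueError("expected a bare domain such as 'ai.example.com'")
    # The llm-backend subdomain and the /v1 suffix are fixed; only the domain changes.
    return f"https://llm-backend.{domain}/v1"


# 'ai.example.com' is a placeholder domain used only for illustration.
print(coreai_base_url("ai.example.com"))  # https://llm-backend.ai.example.com/v1
```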

This API already handles the platform integration behind the scenes:

* model routing
* authentication
* platform policy enforcement
* future component evolution behind the API boundary

Authentication Options [#authentication-options]

The backend accepts either of these as a bearer token:

* `JWT bearer tokens`
* `BullSequana API keys` in the `sk-bsq-...` format

For interactive user access, JWT is a good fit.

For local developer tools, scripts, IDE plugins, and long-lived integrations, API keys are usually the cleaner option when your deployment exposes API key management.
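Since both credential types travel in the same `Authorization: Bearer` header, a script can pick whichever one is available. The sketch below assumes the environment variable names `BSQAI_API_KEY` and `BSQAI_TOKEN` used in the examples on this page; preferring the API key over the JWT is an illustrative choice, not a platform rule.

```python
import os


def auth_headers(env=os.environ) -> dict:
    """Return an Authorization header built from an API key or a JWT.

    Assumption: the API key is preferred when both variables are set.
    """
    token = env.get("BSQAI_API_KEY") or env.get("BSQAI_TOKEN")
    if not token:
        raise RuntimeError("set BSQAI_API_KEY or BSQAI_TOKEN")
    # Both credential types are sent the same way, as a bearer token.
    return {"Authorization": f"Bearer {token}"}
```

Passing `env` explicitly, for example `auth_headers({"BSQAI_TOKEN": "<jwt-token>"})`, makes the helper easy to exercise without touching the real environment.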

Option 1: Authenticate with JWT [#option-1-authenticate-with-jwt]

If your platform uses Keycloak-backed interactive access, obtain a JWT and pass it as the bearer token.

Example:

```bash
export BSQAI_TOKEN="<jwt-token>"

curl "$BSQAI_BASE_URL/models" \
  -H "Authorization: Bearer $BSQAI_TOKEN"
```

In local backend development, the sibling `coreai-llm-backend` repo already includes a helper script:

```bash
./scripts/get_jwt_token.sh --print-token
```

Option 2: Create and use an API key [#option-2-create-and-use-an-api-key]

API keys are created through the CoreAI API itself. The full key is returned only once, at creation time, so store it securely as soon as you receive it.

First, create the key by calling the API with a valid JWT:

```bash
curl -X POST "$BSQAI_BASE_URL/api-keys" \
  -H "Authorization: Bearer $BSQAI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Local Development"
  }'
```

Typical response shape:

```json
{
  "id": "a37e6b2e-728e-4ffc-aea4-e5415616a96a",
  "name": "Local Development",
  "key": "sk-bsq-v1-...",
  "masked": "sk-bsq-v1-********-e5b4a3",
  "created_at": "2024-01-15T10:30:00Z"
}
```

Then use that key as the bearer token:

```bash
export BSQAI_API_KEY="sk-bsq-v1-..."

curl "$BSQAI_BASE_URL/models" \
  -H "Authorization: Bearer $BSQAI_API_KEY"
```
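The same key-creation call can be made from Python. This is a sketch using only the standard library; the helper names are hypothetical, and the `key` field is taken from the typical response shape shown above, which may differ in your deployment.

```python
import json
import urllib.request


def build_create_key_request(base_url: str, jwt: str, name: str) -> urllib.request.Request:
    """Prepare the POST /api-keys request without sending it."""
    return urllib.request.Request(
        f"{base_url}/api-keys",
        data=json.dumps({"name": name}).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {jwt}",
            "Content-Type": "application/json",
        },
    )


def extract_key(response_body: str) -> str:
    """Pull the one-time secret out of the creation response."""
    return json.loads(response_body)["key"]


def create_api_key(base_url: str, jwt: str, name: str) -> str:
    """Send the request and return the new key; the key is shown only this once."""
    request = build_create_key_request(base_url, jwt, name)
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_key(response.read().decode("utf-8"))
```

Because the key cannot be retrieved again, the caller of `create_api_key` should write the result to a secret store or environment file immediately.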

Discover Available Models [#discover-available-models]

Before wiring an application, check which models your deployment exposes. Available models are visible in the `CoreAI Portal`.

For the API-level source of truth in the current environment, query the models endpoint:

```bash
curl "$BSQAI_BASE_URL/models" \
  -H "Authorization: Bearer $BSQAI_API_KEY"
```

Use the model names returned by this call verbatim in the `model` field of your requests.
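To consume the model list programmatically, something like the following may work. It assumes the endpoint returns the usual OpenAI-style list object, with a `data` array whose items carry an `id`; verify this against your deployment. The function names are hypothetical.

```python
import json
import urllib.request


def model_ids(models_payload: dict) -> list[str]:
    """Extract sorted model names from an OpenAI-style /models payload (assumed shape)."""
    return sorted(item["id"] for item in models_payload.get("data", []))


def fetch_model_ids(base_url: str, token: str) -> list[str]:
    """Call GET /models with a bearer token and return the model names."""
    request = urllib.request.Request(
        f"{base_url}/models",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return model_ids(json.load(response))
```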

Use the OpenAI-Compatible API [#use-the-openai-compatible-api]

The CoreAI API exposes OpenAI-compatible endpoints, including:

* `/v1/models`
* `/v1/chat/completions`
* `/v1/responses`

That means many SDKs and tools can be pointed at BullSequana AI by changing only the base URL and the token.

Chat Completions example [#chat-completions-example]

```bash
curl -X POST "$BSQAI_BASE_URL/chat/completions" \
  -H "Authorization: Bearer $BSQAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<model-name-from-/v1/models>",
    "messages": [
      { "role": "user", "content": "Hello from BullSequana AI" }
    ]
  }'
```
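A Python equivalent of the request above, sketched with the standard library only. The helper names are hypothetical, and reading the reply from `choices[0].message.content` assumes the standard OpenAI chat completion response shape.

```python
import json
import urllib.request


def build_chat_request(base_url: str, token: str, model: str, user_text: str) -> urllib.request.Request:
    """Prepare a POST /chat/completions request with a single user message."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def first_reply(completion: dict) -> str:
    """Return the assistant text of the first choice (assumed OpenAI response shape)."""
    return completion["choices"][0]["message"]["content"]


def chat(base_url: str, token: str, model: str, user_text: str) -> str:
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(base_url, token, model, user_text)
    with urllib.request.urlopen(request, timeout=60) as response:
        return first_reply(json.load(response))
```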

Responses API example [#responses-api-example]

```python
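import os

from openai import OpenAI

# Read credentials from the environment instead of hard-coding them.
client = OpenAI(
    api_key=os.environ["BSQAI_API_KEY"],    # sk-bsq-... API key (a JWT also works)
    base_url=os.environ["BSQAI_BASE_URL"],  # https://llm-backend.<platform-domain>/v1
)

response = client.responses.create(
    model="<model-name-from-/v1/models>",
    input="Explain the deployment model of BullSequana AI.",
)

# output_text is the SDK's convenience accessor for the generated text.
print(response.output_text)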
```

IDE And Tooling Integrations [#ide-and-tooling-integrations]

For tools such as `Continue`, `OpenCode`, or custom internal applications, point the tool to the CoreAI API, not to LiteLLM directly.

The correct pattern is:

* provider type: OpenAI-compatible
* base URL: your BullSequana AI CoreAI API
* bearer token: JWT or `sk-bsq-...` API key

Recommended Integration Rule [#recommended-integration-rule]

Use this rule for all developer-facing integrations:

`Application or tool -> CoreAI API -> platform components`

Not:

`Application or tool -> LiteLLM`

That keeps the developer contract stable even if the platform team changes the internal inference or proxy layer later.
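One way to enforce this rule in a custom application is to validate the configured base URL at startup. The check below is a sketch based only on the URL pattern documented on this page (`https://llm-backend.<platform-domain>/v1`); the function name and the strict HTTPS requirement are assumptions that may need relaxing for local development.

```python
from urllib.parse import urlsplit


def is_coreai_base_url(url: str) -> bool:
    """Return True if the URL matches the documented CoreAI API pattern."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return (
        parts.scheme == "https"                  # assumption: deployments use HTTPS
        and host.startswith("llm-backend.")      # fixed subdomain across environments
        and parts.path.rstrip("/") == "/v1"      # versioned API root
    )
```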

Related Pages [#related-pages]

* [CoreAI API](/docs/coreai/components/coreai-api)
* [Developer Tools](/docs/development/developer-tools)
* [Deploy apps on the platform](/docs/development/deploy-apps-on-the-platform)