Augure API
OpenAI-compatible chat completions API on sovereign Canadian infrastructure. No US exposure. No CLOUD Act.
Base URL: https://api.augureai.ca

Data Routing & Residency
All API requests enter through our gateway on OVHcloud infrastructure in Beauharnois, Quebec. Inference runs on sovereign infrastructure with no US parent company in the stack, and no data touches US infrastructure at any point; Ossington 4 runs on Canadian GPU infrastructure (Denvr, Calgary). Prompts are encrypted in transit (TLS 1.2+), never logged by Augure, and never used for model training.
Authentication
All API endpoints require a Bearer token. Include your API key in the Authorization header of every request.
curl https://api.augureai.ca/v1/models \
-H "Authorization: Bearer YOUR_API_KEY"

Getting a key: API keys are issued through our gated application process. Apply for access to get started.
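The same authenticated call can be sketched in Python with the standard library (the URL and header format are from the docs above; the helper names are our own):

```python
import urllib.request

def auth_headers(api_key: str) -> dict:
    # Every Augure API request carries the key as a Bearer token.
    return {"Authorization": f"Bearer {api_key}"}

def list_models_request(api_key: str) -> urllib.request.Request:
    # Build (but do not send) a GET request for /v1/models.
    return urllib.request.Request(
        "https://api.augureai.ca/v1/models",
        headers=auth_headers(api_key),
    )

req = list_models_request("YOUR_API_KEY")
print(req.get_header("Authorization"))  # Bearer YOUR_API_KEY
```

Send it with `urllib.request.urlopen(req)` (or any HTTP client) once you have a valid key.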
Models
Three models are available, optimized for different workloads.
| Model | Description | Best for |
|---|---|---|
| ossington-4 | Large model, highest capability | Complex reasoning, legal analysis, document review |
| tofino-2.5 | Fast, efficient small model | Chat, summaries, quick tasks |
| augure-nano | Compact 8B model | Classification, extraction, simple tasks |
OpenAI compatibility: The aliases gpt-4, gpt-4o, gpt-4o-mini, and gpt-3.5-turbo are supported for drop-in compatibility with OpenAI client libraries: gpt-4 and gpt-4o map to ossington-4, while gpt-4o-mini and gpt-3.5-turbo map to tofino-2.5.
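The alias resolution can be sketched as a simple lookup. The exact pairing below (larger OpenAI aliases to ossington-4, smaller ones to tofino-2.5) is our reading of the compatibility note; treat it as an assumption:

```python
# Assumed alias table based on the compatibility note above.
MODEL_ALIASES = {
    "gpt-4": "ossington-4",
    "gpt-4o": "ossington-4",
    "gpt-4o-mini": "tofino-2.5",
    "gpt-3.5-turbo": "tofino-2.5",
}

def resolve_model(model_id: str) -> str:
    # Native model IDs pass through unchanged; known aliases are rewritten.
    return MODEL_ALIASES.get(model_id, model_id)

print(resolve_model("gpt-4o"))       # ossington-4
print(resolve_model("augure-nano"))  # augure-nano
```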
Endpoints
/v1/chat/completions

Create a chat completion. Accepts the same request format as the OpenAI chat completions endpoint.
Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID (see Models above) |
| messages | array | Yes | Array of message objects |
| stream | boolean | No | Stream response via SSE. Default: false |
| temperature | number | No | Sampling temperature (0.0–2.0) |
| max_tokens | number | No | Max tokens to generate (up to 32,768) |
| top_p | number | No | Nucleus sampling threshold |
| stop | string or array | No | Stop sequence(s) |
Each message in the messages array has a role ("system", "user", or "assistant") and a content string.
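A minimal request body that satisfies the required fields above (model IDs from the Models table; the sampling values are arbitrary examples):

```python
import json

# Only model and messages are required; the sampling knobs are optional.
payload = {
    "model": "tofino-2.5",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello"},
    ],
    "temperature": 0.7,  # optional, 0.0-2.0
    "max_tokens": 512,   # optional, up to 32,768
}

body = json.dumps(payload)
```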
Example Request
curl -X POST https://api.augureai.ca/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "ossington-4",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is the Civil Code of Quebec?"}
]
}'

Example Response
{
"id": "chatcmpl-a9adf17e-5ff3-4804-b01e-f7cbd30ae996",
"object": "chat.completion",
"created": 1771286577,
"model": "ossington-4",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The Civil Code of Quebec (Code civil du Québec) is..."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 24,
"completion_tokens": 150,
"total_tokens": 174
},
"_augure": {
"gateway_region": "ca-montreal-1",
"inference_region": "augure-cloud",
"request_id": "a9adf17e-5ff3-4804-b01e-f7cbd30ae996"
}
}

Streaming
Set "stream": true to receive Server-Sent Events. Each event is a JSON chunk with a delta object containing incremental content. The stream ends with data: [DONE].
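Client-side, the stream reduces to parsing data: lines until the [DONE] sentinel. A sketch, assuming the chunks follow the OpenAI streaming shape (choices[0].delta holding incremental content, as described above):

```python
import json

def collect_content(sse_lines):
    # Accumulate incremental content from a stream of SSE "data:" lines.
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank lines and comments between events
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

# Illustrative chunks, not captured from a live response.
stream = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print(collect_content(stream))  # Hello
```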
curl -N -X POST https://api.augureai.ca/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "tofino-2.5",
"messages": [{"role": "user", "content": "Hello"}],
"stream": true
}'

/v1/models

Returns a list of all available models.
curl https://api.augureai.ca/v1/models \
-H "Authorization: Bearer YOUR_API_KEY"

Response
{
"object": "list",
"data": [
{ "id": "ossington-4", "object": "model", "owned_by": "augure" },
{ "id": "tofino-2.5", "object": "model", "owned_by": "augure" },
{ "id": "augure-nano", "object": "model", "owned_by": "augure" }
]
}

Client Libraries
Use any OpenAI-compatible SDK. Just point it to https://api.augureai.ca/v1 as the base URL.
from openai import OpenAI
client = OpenAI(
api_key="YOUR_API_KEY",
base_url="https://api.augureai.ca/v1"
)
response = client.chat.completions.create(
model="ossington-4",
messages=[
{"role": "user", "content": "Explain Quebec privacy law"}
]
)
print(response.choices[0].message.content)

import OpenAI from "openai";
const client = new OpenAI({
apiKey: "YOUR_API_KEY",
baseURL: "https://api.augureai.ca/v1",
});
const response = await client.chat.completions.create({
model: "tofino-2.5",
messages: [{ role: "user", content: "Summarize PIPEDA" }],
});
console.log(response.choices[0].message.content);

Limits
| Limit | Value |
|---|---|
| Request body | 2 MB max |
| Messages per request | 256 max |
| Max output tokens | 32,768 |
| Request timeout | 300 seconds |
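A client-side pre-flight check against these limits avoids burning a request on a 400 or 413. The limit constants are from the table above; the validator itself is a sketch:

```python
import json

MAX_BODY_BYTES = 2 * 1024 * 1024  # 2 MB request body cap
MAX_MESSAGES = 256                # messages per request
MAX_OUTPUT_TOKENS = 32_768        # ceiling for max_tokens

def validate_request(payload: dict) -> None:
    # Raise locally before sending anything over the wire.
    if len(payload.get("messages", [])) > MAX_MESSAGES:
        raise ValueError("too many messages (max 256)")
    if payload.get("max_tokens", 0) > MAX_OUTPUT_TOKENS:
        raise ValueError("max_tokens above 32,768")
    if len(json.dumps(payload).encode("utf-8")) > MAX_BODY_BYTES:
        raise ValueError("request body exceeds 2 MB")
```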
Token quotas are applied per API key. Contact us if you need higher throughput for production workloads.
Errors
All errors return a JSON object with an error field, matching the OpenAI error format.
{
"error": {
"message": "Invalid API key provided",
"type": "invalid_request_error",
"param": null,
"code": "invalid_api_key"
}
}

| Status | Meaning |
|---|---|
| 401 | Missing or invalid API key |
| 400 | Malformed request or missing required fields |
| 404 | Unknown model or endpoint |
| 413 | Request body exceeds 2 MB |
| 429 | Token quota exceeded for this API key |
| 502 | Upstream processing error — retry shortly |
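Since 429 and 502 are the only statuses documented as transient, a retry wrapper can key off exactly those. A sketch, where send is a placeholder for whatever HTTP call you use (it should return a status code and parsed body):

```python
import time

RETRYABLE = {429, 502}  # quota exceeded / upstream error

def with_retries(send, max_attempts=3, base_delay=1.0):
    # Retry only on statuses the API documents as transient,
    # backing off exponentially between attempts.
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        if attempt < max_attempts - 1:
            time.sleep(base_delay * (2 ** attempt))
    return status, body
```

401, 400, 404, and 413 indicate problems with the request itself and should not be retried unchanged.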