
Text Generation & Prompting

ASI:One’s Chat Completion endpoint lets you turn plain instructions (a prompt) into rich text—code, math, structured JSON, or natural-sounding prose. This page shows how to call the endpoint, explains every request field, and lists common errors.


Endpoint

POST https://api.asi1.ai/v1/chat/completions

Required headers

| Header | Type | Description |
| --- | --- | --- |
| Authorization | string | `Bearer YOUR_API_KEY` |
| x-session-id | string | A unique session identifier used for rate-limiting and tracing |

Request body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| agent_address | string | optional | Address of the calling agent (for rate limits and auditing). |
| model | string | required | Model name, e.g. `asi1`. |
| messages | array&lt;object&gt; | required | Conversation history (see below). |
| temperature | number | optional | 0–2. Controls randomness. Default 1.0. |
| max_tokens | integer | optional | Maximum number of tokens to generate. |
| stream | boolean | optional | If true, the response is streamed as Server-Sent Events. |
| tools | array&lt;object&gt; | optional | Tool definitions for tool-calling. |
| web_search | boolean | optional | Enable or disable web search. |

Message objects follow the OpenAI style:

{
  "role": "user | assistant | system",
  "content": "…"
}

Quick start (choose your language)

curl -X POST https://api.asi1.ai/v1/chat/completions \
  -H "Authorization: Bearer $ASI_ONE_API_KEY" \
  -H "x-session-id: $(uuidgen)" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "asi1",
    "messages": [
      {"role": "user", "content": "Write a one-sentence bedtime story about a unicorn."}
    ]
  }'
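The same request can be made from Python. This is a minimal sketch using only the standard library; it assumes your key is in the `ASI_ONE_API_KEY` environment variable, and the helper names (`build_request`, `chat`) are illustrative, not part of the API:

```python
import json
import os
import uuid
import urllib.request

API_URL = "https://api.asi1.ai/v1/chat/completions"


def build_request(prompt, model="asi1"):
    """Assemble the headers and JSON body for a chat completion call."""
    headers = {
        "Authorization": f"Bearer {os.environ.get('ASI_ONE_API_KEY', '')}",
        "x-session-id": str(uuid.uuid4()),  # unique per session, used for rate-limiting
        "Content-Type": "application/json",
    }
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return headers, body


def chat(prompt):
    """Send the request and return the assistant's reply text."""
    headers, body = build_request(prompt)
    req = urllib.request.Request(
        API_URL, data=json.dumps(body).encode(), headers=headers, method="POST"
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        data = json.loads(resp.read())
    return data["choices"][0]["message"]["content"]


# chat("Write a one-sentence bedtime story about a unicorn.")  # needs a valid key
```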

Response schema (200)

{
  "id": "083bd6ca857843c0911edf83799ca6c8",
  "choices": [{
    "finish_reason": "stop",
    "index": 0,
    "logprobs": null,
    "message": {
      "content": "A little unicorn with a silver mane pranced through clouds of cotton candy, then curled up in a meadow of starlight to sleep while her horn glowed softly like a night-light.",
      "refusal": null,
      "role": "assistant",
      "annotations": null,
      "audio": null,
      "function_call": null,
      "tool_calls": null,
      "reasoning_content": null
    },
    "matched_stop": 151336
  }],
  "created": 1768476062,
  "model": "asi1",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 39,
    "prompt_tokens": 2107,
    "total_tokens": 2146,
    "completion_tokens_details": null,
    "prompt_tokens_details": null,
    "reasoning_tokens": 0
  },
  "metadata": {
    "weight_version": "default"
  }
}

Field breakdown

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique identifier for the completion. |
| model | string | Model that generated the response. |
| choices | array \| null | List of generated choices/messages. |
| executable_data | array \| null | Structured tool calls or agent manifests (agentic models). |
| intermediate_steps | array \| null | Internal reasoning breadcrumbs (streaming/extended). |
| conversation_id | string \| null | ID you can supply to continue a thread (optional). |
| thought | array \| null | Lightweight reasoning trace returned during streaming. |
| usage | object \| null | Token-usage accounting. |

choices[] object

| Field | Type | Description |
| --- | --- | --- |
| index | integer | Position in the choices array. |
| finish_reason | string | Why generation stopped (stop, length, etc.). |
| message | object \| null | Assistant message when not streaming. |
| delta | object \| null | Incremental message chunk when stream=true. |

message / delta object

| Field | Type | Description |
| --- | --- | --- |
| role | string | Always assistant for model output. |
| content | string | Text produced so far (streaming) or full text (non-stream). |

usage object

| Field | Type | Description |
| --- | --- | --- |
| prompt_tokens | integer | Number of tokens you sent. |
| completion_tokens | integer | Tokens generated by the model. |
| total_tokens | integer | Sum of prompt + completion (used for billing/limits). |
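The fields above map directly onto a parsed response. A short sketch, reusing a trimmed copy of the sample 200 response shown earlier:

```python
import json

# Trimmed version of the sample 200 response from this page.
raw = """{
  "id": "083bd6ca857843c0911edf83799ca6c8",
  "model": "asi1",
  "choices": [{
    "index": 0,
    "finish_reason": "stop",
    "message": {"role": "assistant", "content": "A little unicorn..."}
  }],
  "usage": {"prompt_tokens": 2107, "completion_tokens": 39, "total_tokens": 2146}
}"""

data = json.loads(raw)
choice = data["choices"][0]
reply = choice["message"]["content"]         # the generated text
stopped = choice["finish_reason"] == "stop"  # False would mean e.g. "length" (hit max_tokens)
billed = data["usage"]["total_tokens"]       # prompt + completion, used for billing/limits
```

Checking `finish_reason` before trusting the output is worthwhile: a `length` stop usually means the reply was truncated mid-sentence.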

Error codes

| Code | Meaning |
| --- | --- |
| 400 | Bad request (invalid JSON, missing fields) |
| 404 | Not found (unknown endpoint) |
| 429 | Rate limit exceeded |
| 500 | Internal server error |
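A 429 means you are over the rate limit; the usual remedy is to retry with exponential backoff. A generic sketch (the `send_fn` callable and the `RuntimeError` stand-in for an HTTP 429 are illustrative, not part of the API):

```python
import time


def with_backoff(send_fn, max_retries=4, base_delay=1.0):
    """Call send_fn(); on a rate-limit error, wait and retry with doubling delays."""
    for attempt in range(max_retries + 1):
        try:
            return send_fn()
        except RuntimeError:  # stand-in for a 429 error raised by your HTTP client
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, 8s, ...
```

In practice you would catch your HTTP client's specific rate-limit exception rather than `RuntimeError`, and avoid retrying 400s, which will fail identically every time.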

Prompt engineering basics

  1. Use roles – system for global instructions, user for questions.
  2. Be explicit – tell the model what format you expect.
  3. Few-shot examples – include 2–3 Q&A pairs to steer style.
  4. Temperature – lower (0–0.3) for deterministic output, higher for creativity.
  5. Pin versions – specify a model snapshot once versioning is exposed.
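The first four tips combine naturally in a single request body. A sketch (the instruction text and Q&A pair are illustrative) that pins a system instruction, adds one few-shot example, and lowers the temperature for near-deterministic output:

```python
payload = {
    "model": "asi1",
    "temperature": 0.2,  # low temperature: favor deterministic, repeatable answers
    "messages": [
        # system: global instructions that apply to the whole conversation
        {"role": "system", "content": "You are a terse assistant. Answer in valid JSON only."},
        # few-shot pair steering the expected output format
        {"role": "user", "content": "Capital of France?"},
        {"role": "assistant", "content": "{\"answer\": \"Paris\"}"},
        # the actual question
        {"role": "user", "content": "Capital of Japan?"},
    ],
}
```

Sent as the request body to the endpoint above, this payload makes the desired format explicit twice: once in the system message and once by example.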