Create chat completion

POST /z1/chat/completions

Authentication: Bearer auth or API key auth

Main endpoint for AI chat interactions. Supports:

Multi-turn conversations with memory
RAG document retrieval
Streaming responses
Tool/function calling
Response caching
Web grounding

Plimver AI Models

PV-TURBO: Ultra-fast responses (<1s) - 1 credit/request
PV-STANDARD: Balanced performance - 2 credits/request
PV-ADVANCED: Deep reasoning - 5 credits/request
PV-CODEX: Specialized coding - 3 credits/request
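
The per-request credit costs listed above can be turned into a quick budget estimate. The following is a minimal sketch, not part of the API: the helper name `estimate_credits` is hypothetical, and it assumes billing is a flat charge per request exactly as listed, with no extra charge for options such as RAG or streaming.

```python
# Credit cost per request, copied from the model list above.
CREDITS_PER_REQUEST = {
    "PV-TURBO": 1,
    "PV-STANDARD": 2,
    "PV-ADVANCED": 5,
    "PV-CODEX": 3,
}


def estimate_credits(requests_by_model):
    """Sum the credits for a planned number of requests per model.

    Hypothetical helper: assumes a flat per-request charge.
    """
    total = 0
    for model, count in requests_by_model.items():
        if model not in CREDITS_PER_REQUEST:
            raise ValueError(f"Unknown model: {model}")
        if count < 0:
            raise ValueError(f"Request count must be non-negative: {count}")
        total += CREDITS_PER_REQUEST[model] * count
    return total
```

For example, ten PV-TURBO requests plus two PV-ADVANCED requests would come to 10 × 1 + 2 × 5 = 20 credits under this assumption.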

application/json

Body Required

model string Required

Plimver AI model to use

Values are PV-TURBO, PV-STANDARD, PV-ADVANCED, or PV-CODEX.
messages array[object] Required

Array of conversation messages
- role string Required
  
  Message role
  
  Values are system, user, assistant, or function.
- content string Required
  
  Message content
- name string
  
  Name for function messages
user_id string Required

Unique user identifier for memory isolation
temperature number(float)

Sampling temperature

Minimum value is 0, maximum value is 2. Default value is 0.7.
max_tokens integer

Maximum tokens in response

Minimum value is 1, maximum value is 4096. Default value is 2000.
stream boolean

Enable Server-Sent Events streaming

Default value is false.
use_rag boolean

Enable RAG document retrieval

Default value is false.
rag_k integer

Number of RAG documents to retrieve (tier-dependent)

Minimum value is 1, maximum value is 100. Default value is 5.
skip_cache boolean

Bypass response cache

Default value is false.
tools array[object]

Function calling tools (OpenAI format)
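
The body parameters above can be checked on the client before a request is sent. The sketch below builds a request body and enforces the documented value sets, ranges, and defaults. The function name `build_chat_request` is hypothetical, and rejecting an empty `messages` array or an empty `user_id` is an assumption, since the reference only marks those fields as required.

```python
ALLOWED_MODELS = {"PV-TURBO", "PV-STANDARD", "PV-ADVANCED", "PV-CODEX"}
ALLOWED_ROLES = {"system", "user", "assistant", "function"}


def build_chat_request(model, user_id, messages, *, temperature=0.7,
                       max_tokens=2000, stream=False, use_rag=False,
                       rag_k=5, skip_cache=False, tools=None):
    """Return a JSON-serializable request body, or raise ValueError."""
    if model not in ALLOWED_MODELS:
        raise ValueError(f"model must be one of {sorted(ALLOWED_MODELS)}")
    if not user_id:
        raise ValueError("user_id is required")  # assumption: must be non-empty
    if not messages:
        raise ValueError("messages is required")  # assumption: must be non-empty
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"invalid role: {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError("each message needs a string content")
    # Ranges below are the documented minimum and maximum values.
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if not 1 <= max_tokens <= 4096:
        raise ValueError("max_tokens must be between 1 and 4096")
    if not 1 <= rag_k <= 100:
        raise ValueError("rag_k must be between 1 and 100")

    payload = {
        "model": model,
        "user_id": user_id,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stream": stream,
        "use_rag": use_rag,
        "rag_k": rag_k,
        "skip_cache": skip_cache,
    }
    if tools is not None:
        payload["tools"] = tools  # passed through unchanged (OpenAI format)
    return payload
```

The sketch sends every optional field with its documented default, which should be equivalent to omitting it. Because `rag_k` is described as tier-dependent, the server may still reject a value that passes this check.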

Responses

200 application/json

Successful response
- id string
- object string
- created integer(int64)
- model string
- choices array[object]
  - index integer
  - message object
    - role string Required

      Message role

      Values are system, user, assistant, or function.
    - content string Required

      Message content
    - name string

      Name for function messages
  - finish_reason string

    Values are stop, length, tool_calls, or content_filter.
- usage object
  - prompt_tokens integer
  - completion_tokens integer
  - total_tokens integer
- citations array[object]
  
  RAG citations (if use_rag=true)
- memory_context object
  - items_retrieved integer
  - retrieval_time_ms integer
200 text/event-stream

Server-Sent Events stream (when stream=true)
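
When `stream` is true, the response arrives as Server-Sent Events rather than a single JSON object. This reference does not describe the event payloads, so the sketch below covers only the generic SSE framing: `data:` fields, comment lines beginning with `:`, and a blank line ending each event. The function name `iter_sse_data` is hypothetical, and what each payload contains (for example a JSON chunk) remains an assumption to confirm against real output.

```python
def iter_sse_data(lines):
    """Yield the data payload of each SSE event from an iterable of text lines."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield "\n".join(data_parts)
                data_parts = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "data":
            data_parts.append(value)
        # Other fields (event, id, retry) are ignored in this sketch.
    if data_parts:
        # Lenient choice: flush a final event that lacks its closing blank line.
        yield "\n".join(data_parts)
```

Feeding it the decoded lines of the HTTP response body yields one string per event, and multiple `data:` lines within one event are joined with newlines.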
400 application/json

Bad request - invalid parameters
- status string
- message string
- code string
- details object | null
401 application/json

Unauthorized - invalid or missing API key
- status string
- message string
- code string
- details object | null
429 application/json

Rate limit exceeded
Headers
- X-RateLimit-Limit integer
  
  Rate limit ceiling for this endpoint
- X-RateLimit-Remaining integer
  
  Number of requests remaining
- X-RateLimit-Reset integer
  
  Timestamp when rate limit resets
- Retry-After integer
  
  Seconds to wait before retrying
Response body
- status string
- message string
- code string
- details object | null
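
The rate-limit headers on a 429 response can drive a client-side wait before retrying. The sketch below prefers `Retry-After` and falls back to `X-RateLimit-Reset`. The function name `retry_delay_seconds` is hypothetical, and treating `X-RateLimit-Reset` as a Unix timestamp in seconds is an assumption, since the reference describes it only as a timestamp.

```python
def retry_delay_seconds(headers, now, default=1.0):
    """Return how many seconds to wait before retrying a 429 response.

    headers: mapping of response header names to string values.
    now: current time as Unix epoch seconds (for example time.time()).
    """
    lowered = {name.lower(): value for name, value in headers.items()}

    retry_after = lowered.get("retry-after")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # fall through to the reset timestamp

    reset = lowered.get("x-ratelimit-reset")
    if reset is not None:
        try:
            # Assumption: the value is Unix epoch seconds.
            return max(0.0, float(reset) - now)
        except ValueError:
            pass

    return default
```

A caller would typically sleep for the returned number of seconds and then resend the same request, giving up after a bounded number of attempts.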
500 application/json

Internal server error
- status string
- message string
- code string
- details object | null

POST /z1/chat/completions

curl \
 --request POST 'https://zenux-api.redglacier-fb4abe56.southafricanorth.azurecontainerapps.io/z1/chat/completions' \
 --header "Authorization: Bearer $ACCESS_TOKEN" \
 --header "Content-Type: application/json" \
 --data '{"model":"PV-TURBO","user_id":"user_123","messages":[{"role":"user","content":"What is the capital of France?"}]}'
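
The same call can be made from Python using only the standard library. This is a sketch under stated assumptions: the helper names `make_request` and `send` are hypothetical, the token is sent as a Bearer token as in the curl command above, and the response is assumed to be non-streaming JSON.

```python
import json
import urllib.request

# Host taken from the curl example above.
BASE_URL = "https://zenux-api.redglacier-fb4abe56.southafricanorth.azurecontainerapps.io"


def make_request(payload, token):
    """Build (but do not send) the POST request for /z1/chat/completions."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/z1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30.0):
    """Send the request and decode a JSON (non-streaming) response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(make_request(payload, token))` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so the documented error bodies would have to be read from that exception.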

Request examples

{
  "model": "PV-TURBO",
  "user_id": "user_123",
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of France?"
    }
  ]
}

{
  "model": "PV-STANDARD",
  "rag_k": 5,
  "use_rag": true,
  "user_id": "user_123",
  "messages": [
    {
      "role": "user",
      "content": "What does our privacy policy say about data retention?"
    }
  ]
}

{
  "model": "PV-TURBO",
  "stream": true,
  "user_id": "user_123",
  "messages": [
    {
      "role": "user",
      "content": "Write a short story"
    }
  ]
}

Response examples (200)

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1699564800,
  "model": "PV-TURBO",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "string",
        "name": "string"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 42,
    "total_tokens": 84
  },
  "citations": [
    {}
  ],
  "memory_context": {
    "items_retrieved": 42,
    "retrieval_time_ms": 42
  }
}
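
A client usually needs only a few fields from the 200 response. The sketch below pulls them out of an already-decoded body using the field names from the schema above. The function name `summarize_completion` is hypothetical, and reading a `finish_reason` of `length` as "the reply was cut off at `max_tokens`" is an assumption based on the usual meaning of that value.

```python
def summarize_completion(body):
    """Extract the commonly used fields from a decoded 200 response."""
    choices = body.get("choices") or []
    first = choices[0] if choices else {}
    message = first.get("message") or {}
    usage = body.get("usage") or {}
    finish_reason = first.get("finish_reason")
    return {
        "text": message.get("content"),
        "finish_reason": finish_reason,
        # Assumption: "length" means the reply hit the max_tokens limit.
        "truncated": finish_reason == "length",
        "total_tokens": usage.get("total_tokens"),
        # Described as present when use_rag=true; defaults to an empty list.
        "citations": body.get("citations") or [],
    }
```

Every lookup tolerates a missing field, so the helper returns `None` values instead of raising if the server omits optional parts of the response.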

Response examples (400)

{
  "code": "INVALID_MODEL",
  "status": "error",
  "message": "Invalid model specified"
}

Response examples (401)

{
  "code": "UNAUTHORIZED",
  "status": "error",
  "message": "Invalid API key"
}

Response examples (429)

{
  "code": "RATE_LIMIT_EXCEEDED",
  "status": "error",
  "message": "Rate limit exceeded: 1000 requests per minute"
}

Response examples (500)

{
  "code": "INTERNAL_ERROR",
  "status": "error",
  "message": "An unexpected error occurred"
}
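
The 400, 401, 429, and 500 responses share one body shape (`status`, `message`, `code`, `details`), so a single helper can convert any of them into an exception. This is a minimal sketch: the class `ApiError` and the function `error_from_response` are hypothetical names, and the fallback code `UNKNOWN` is an assumption for bodies that are not valid JSON.

```python
import json


class ApiError(Exception):
    """Error returned by the API, built from the documented error body."""

    def __init__(self, http_status, code, message, details=None):
        super().__init__(f"{http_status} {code}: {message}")
        self.http_status = http_status
        self.code = code
        self.message = message
        self.details = details


def error_from_response(http_status, raw_body):
    """Convert an error response body (text) into an ApiError."""
    try:
        body = json.loads(raw_body)
    except json.JSONDecodeError:
        body = {}
    if not isinstance(body, dict):
        body = {}
    return ApiError(
        http_status,
        body.get("code", "UNKNOWN"),      # fallback code is an assumption
        body.get("message", raw_body),    # fall back to the raw text
        body.get("details"),
    )
```

Callers can then branch on `error.code` (for example `RATE_LIMIT_EXCEEDED`) rather than matching on message text.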