Create chat completion

POST /z1/chat/completions

Main endpoint for AI chat interactions. Supports:

  • Multi-turn conversations with memory
  • RAG document retrieval
  • Streaming responses
  • Tool/function calling
  • Response caching
  • Web grounding

Plimver AI Models

  • PV-TURBO: Ultra-fast responses (<1s) - 1 credit/request
  • PV-STANDARD: Balanced performance - 2 credits/request
  • PV-ADVANCED: Deep reasoning - 5 credits/request
  • PV-CODEX: Specialized coding - 3 credits/request
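
The per-request credit costs listed above can be turned into a quick budget estimate. The following is a minimal sketch that assumes a flat cost per request with no other charges; `CREDITS_PER_REQUEST` and `estimate_credits` are hypothetical names chosen for illustration, not part of the API.

```python
# Credit cost per request, copied from the model list above.
CREDITS_PER_REQUEST = {
    "PV-TURBO": 1,
    "PV-STANDARD": 2,
    "PV-ADVANCED": 5,
    "PV-CODEX": 3,
}


def estimate_credits(request_counts: dict[str, int]) -> int:
    """Return the total credits for a mapping of model name -> request count.

    Assumption: billing is a flat per-request cost, as the list above suggests.
    """
    total = 0
    for model, count in request_counts.items():
        if model not in CREDITS_PER_REQUEST:
            raise ValueError(f"Unknown model: {model}")
        total += CREDITS_PER_REQUEST[model] * count
    return total


# 100 PV-TURBO requests (1 credit each) plus 10 PV-ADVANCED requests (5 each)
print(estimate_credits({"PV-TURBO": 100, "PV-ADVANCED": 10}))  # 150
```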

Body Required (application/json)

  • model string Required

    Plimver AI model to use

    Values are PV-TURBO, PV-STANDARD, PV-ADVANCED, or PV-CODEX.

  • messages array[object] Required

    Array of conversation messages

    • role string Required

      Message role

      Values are system, user, assistant, or function.

    • content string Required

      Message content

    • name string

      Name for function messages

  • user_id string Required

    Unique user identifier for memory isolation

  • temperature number(float)

    Sampling temperature

    Minimum value is 0, maximum value is 2. Default value is 0.7.

  • max_tokens integer

    Maximum tokens in response

    Minimum value is 1, maximum value is 4096. Default value is 2000.

  • stream boolean

    Enable Server-Sent Events streaming

    Default value is false.

  • use_rag boolean

    Enable RAG document retrieval

    Default value is false.

  • rag_k integer

    Number of RAG documents to retrieve (tier-dependent)

    Minimum value is 1, maximum value is 100. Default value is 5.

  • skip_cache boolean

    Bypass response cache

    Default value is false.

  • tools array[object]

    Function calling tools (OpenAI format)
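
The body parameters above can be assembled and checked on the client before a request is sent. This is a minimal sketch, not an official client: `build_chat_request` is a hypothetical helper, and the allowed values, ranges, and defaults are copied from the parameter list. The `tools` parameter is left out because its structure is only described as "OpenAI format".

```python
from typing import Any

ALLOWED_MODELS = {"PV-TURBO", "PV-STANDARD", "PV-ADVANCED", "PV-CODEX"}
ALLOWED_ROLES = {"system", "user", "assistant", "function"}


def _check_range(name: str, value: float, low: float, high: float) -> None:
    """Raise ValueError if value falls outside the documented [low, high] range."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


def build_chat_request(
    model: str,
    messages: list[dict[str, Any]],
    user_id: str,
    *,
    temperature: float = 0.7,
    max_tokens: int = 2000,
    stream: bool = False,
    use_rag: bool = False,
    rag_k: int = 5,
    skip_cache: bool = False,
) -> dict[str, Any]:
    """Validate the documented constraints and return a JSON-serializable body."""
    if model not in ALLOWED_MODELS:
        raise ValueError(f"model must be one of {sorted(ALLOWED_MODELS)}")
    if not user_id:
        raise ValueError("user_id is required")
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"invalid message role: {message.get('role')!r}")
        if "content" not in message:
            raise ValueError("each message requires a content field")
    _check_range("temperature", temperature, 0, 2)
    _check_range("max_tokens", max_tokens, 1, 4096)
    _check_range("rag_k", rag_k, 1, 100)
    return {
        "model": model,
        "messages": messages,
        "user_id": user_id,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stream": stream,
        "use_rag": use_rag,
        "rag_k": rag_k,
        "skip_cache": skip_cache,
    }


body = build_chat_request(
    "PV-STANDARD",
    [{"role": "user", "content": "What is the capital of France?"}],
    "user_123",
    use_rag=True,
)
print(body["rag_k"])  # 5 (the documented default)
```

Validating locally only mirrors the documented limits. The server remains the authority, and since `rag_k` is described as tier-dependent, a value inside the documented range may still be rejected.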

Responses

  • 200 application/json

    Successful response

    • id string
    • object string
    • created integer(int64)
    • model string
    • choices array[object]
      • index integer
      • message object
        • role string Required

          Message role

          Values are system, user, assistant, or function.

        • content string Required

          Message content

        • name string

          Name for function messages

      • finish_reason string

        Values are stop, length, tool_calls, or content_filter.

    • usage object
      • prompt_tokens integer
      • completion_tokens integer
      • total_tokens integer
    • citations array[object]

      RAG citations (if use_rag=true)

    • memory_context object
      • items_retrieved integer
      • retrieval_time_ms integer

    With stream=true, the response is sent as a Server-Sent Events stream (text/event-stream).

  • 400 application/json

    Bad request - invalid parameters

    • status string
    • message string
    • code string
    • details object | null
  • 401 application/json

    Unauthorized - invalid or missing API key

    • status string
    • message string
    • code string
    • details object | null
  • 429 application/json

    Rate limit exceeded

    Headers:
    • X-RateLimit-Limit integer

      Rate limit ceiling for this endpoint

    • X-RateLimit-Remaining integer

      Number of requests remaining

    • X-RateLimit-Reset integer

      Timestamp when rate limit resets

    • Retry-After integer

      Seconds to wait before retrying

    Response attributes:
    • status string
    • message string
    • code string
    • details object | null
  • 500 application/json

    Internal server error

    • status string
    • message string
    • code string
    • details object | null
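
A client can use the Retry-After header documented for the 429 response to decide how long to wait before retrying. The sketch below is one possible approach under stated assumptions: `send_with_retry` and `retry_delay_seconds` are hypothetical helpers, the transport is passed in as a callable returning `(status, headers, body)`, and the one-second fallback is an arbitrary choice rather than a documented value.

```python
import time
from typing import Callable, Mapping, Tuple

# (HTTP status code, response headers, response body)
Response = Tuple[int, Mapping[str, str], str]


def retry_delay_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Read Retry-After (seconds) from the headers, or fall back to a default.

    Note: a plain dict lookup is case-sensitive; real HTTP libraries usually
    expose case-insensitive header mappings.
    """
    try:
        return max(0.0, float(headers.get("Retry-After")))
    except (TypeError, ValueError):
        return default


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it returns something other than 429 or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts:
            return status, headers, body
        sleep(retry_delay_seconds(headers))
    raise AssertionError("unreachable")


# Usage with a fake transport: the first call is rate limited, the second succeeds.
fake_responses = iter([
    (429, {"Retry-After": "2"}, '{"code": "RATE_LIMIT_EXCEEDED"}'),
    (200, {}, '{"id": "chatcmpl-123"}'),
])
status, _, _ = send_with_retry(lambda: next(fake_responses), sleep=lambda s: None)
print(status)  # 200
```

Injecting `send` and `sleep` keeps the retry logic independent of any particular HTTP library and lets it be exercised without real waiting.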
POST /z1/chat/completions
curl \
 --request POST 'https://zenux-api.redglacier-fb4abe56.southafricanorth.azurecontainerapps.io/z1/chat/completions' \
 --header "Authorization: Bearer $ACCESS_TOKEN" \
 --header "Content-Type: application/json" \
 --data '{"model":"PV-TURBO","user_id":"user_123","messages":[{"role":"user","content":"What is the capital of France?"}]}'
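
The same request can be built with Python's standard library. This sketch mirrors the curl command above: the URL, headers, and body are taken from it, while `make_request` is a hypothetical helper name. The network call is left commented out so the snippet runs without credentials.

```python
import json
import os
import urllib.request

API_URL = (
    "https://zenux-api.redglacier-fb4abe56.southafricanorth"
    ".azurecontainerapps.io/z1/chat/completions"
)


def make_request(payload: dict, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "model": "PV-TURBO",
    "user_id": "user_123",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
}
request = make_request(payload, os.environ.get("ACCESS_TOKEN", ""))

# To actually send the request (requires a valid token and network access):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```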
Request examples
{
  "model": "PV-TURBO",
  "user_id": "user_123",
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of France?"
    }
  ]
}
{
  "model": "PV-STANDARD",
  "rag_k": 5,
  "use_rag": true,
  "user_id": "user_123",
  "messages": [
    {
      "role": "user",
      "content": "What does our privacy policy say about data retention?"
    }
  ]
}
{
  "model": "PV-TURBO",
  "stream": true,
  "user_id": "user_123",
  "messages": [
    {
      "role": "user",
      "content": "Write a short story"
    }
  ]
}
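
When stream is true, the response arrives as Server-Sent Events. This page does not describe the event payloads, so the sketch below only handles the generic SSE framing (`data:` lines separated by blank lines); the `{"delta": ...}` payloads in the sample are purely hypothetical and `iter_sse_data` is an illustrative helper.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of text lines."""
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield "\n".join(data_parts)
                data_parts = []
        elif line.startswith(":"):
            continue  # SSE comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips a single leading space after the colon.
            if value.startswith(" "):
                value = value[1:]
            data_parts.append(value)
    # Lenient: also emit a final event that was not terminated by a blank line
    # (a strict SSE parser would discard it).
    if data_parts:
        yield "\n".join(data_parts)


# Hypothetical stream content; the real payload format is not documented here.
sample = [
    'data: {"delta": "Once"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"delta": " upon a time"}\n',
    "\n",
]
for payload_text in iter_sse_data(sample):
    print(payload_text)
```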
Response examples (200)
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1699564800,
  "model": "PV-TURBO",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "string",
        "name": "string"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 42,
    "total_tokens": 84
  },
  "citations": [
    {}
  ],
  "memory_context": {
    "items_retrieved": 42,
    "retrieval_time_ms": 42
  }
}
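
A successful response can be reduced to the fields most callers need. This is a minimal sketch using the attribute names from the response schema above; `summarize_completion` is a hypothetical helper, and treating finish_reason "length" as a truncated answer is an assumption based on the OpenAI-style naming, not something this page states.

```python
import json


def summarize_completion(body: str) -> dict:
    """Extract the first choice's text, finish reason, and token usage."""
    data = json.loads(body)
    choices = data.get("choices") or []
    first = choices[0] if choices else {}
    message = first.get("message") or {}
    usage = data.get("usage") or {}
    finish_reason = first.get("finish_reason")
    return {
        "text": message.get("content"),
        "finish_reason": finish_reason,
        # Assumption: "length" means the max_tokens limit cut the answer short.
        "truncated": finish_reason == "length",
        "total_tokens": usage.get("total_tokens"),
        "citations": data.get("citations") or [],
    }


example_body = json.dumps({
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "model": "PV-TURBO",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "string"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 42, "completion_tokens": 42, "total_tokens": 84},
})
summary = summarize_completion(example_body)
print(summary["text"], summary["finish_reason"], summary["total_tokens"])
```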
Response examples (400)
{
  "code": "INVALID_MODEL",
  "status": "error",
  "message": "Invalid model specified"
}
Response examples (401)
{
  "code": "UNAUTHORIZED",
  "status": "error",
  "message": "Invalid API key"
}
Response examples (429)
{
  "code": "RATE_LIMIT_EXCEEDED",
  "status": "error",
  "message": "Rate limit exceeded: 1000 requests per minute"
}
Response examples (500)
{
  "code": "INTERNAL_ERROR",
  "status": "error",
  "message": "An unexpected error occurred"
}
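
The error responses share one shape (status, message, code, details), so a client can map them to a single exception type. The following sketch assumes every status of 400 or above carries that body; `ApiError` and `raise_for_error` are hypothetical names, and the "UNKNOWN" fallback code is an illustrative choice for bodies that cannot be parsed.

```python
import json
from typing import Any, Optional


class ApiError(Exception):
    """Carries the fields of the documented error body."""

    def __init__(self, http_status: int, code: str, message: str,
                 details: Optional[Any] = None) -> None:
        super().__init__(f"{http_status} {code}: {message}")
        self.http_status = http_status
        self.code = code
        self.message = message
        self.details = details


def raise_for_error(http_status: int, body: str) -> None:
    """Raise ApiError for 4xx/5xx responses; return None otherwise."""
    if http_status < 400:
        return
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    raise ApiError(
        http_status,
        data.get("code", "UNKNOWN"),  # hypothetical fallback, not an API code
        data.get("message", ""),
        data.get("details"),
    )


try:
    raise_for_error(
        400,
        '{"code": "INVALID_MODEL", "status": "error", '
        '"message": "Invalid model specified"}',
    )
except ApiError as error:
    print(error.code)  # INVALID_MODEL
```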