Intelligent Q&A Interface Documentation

Interface Description

Creates a chat completion, with support for both streaming and non-streaming responses. The interface accepts a series of messages as input and returns content generated by the model. The protocol extends the OpenAI Chat Completions API with a reasoning_content field. For the latest and complete list of OpenAI Chat Completions parameters, refer to the OpenAI Chat Completions API reference.

Request

HTTP Request

POST /v1/chat/completions

Request Body

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | The ID of the model to use |
| messages | array | Yes | Array of messages containing the conversation history. Different message types (modalities) are supported depending on the model, such as text, images, and audio |
| tools | array | No | List of tools that the model may call. Currently only functions are supported as tools. Use this parameter to provide a list of functions for which the model might generate JSON inputs. Supports up to 128 functions |
| tool_choice | string/object | No | Controls whether the model calls a tool. "none": the model calls no tools and generates a message. "auto": the model chooses between generating a message and calling one or more tools. "required": the model must call one or more tools. A particular tool can also be specified: {"type": "function", "function": {"name": "my_function"}}. Defaults to "none" when no tools are present and to "auto" when tools are present |
| temperature | number | No | Sampling temperature, default is 1. Lower values make the output more deterministic |
| top_p | number | No | Nucleus sampling probability mass, default is 1 |
| n | integer | No | Number of chat completions to generate for each input message, default is 1 |
| stream | boolean | No | Whether to enable streaming responses, default is false |
| stream_options | object | No | Streaming response options, only set when stream=true |
| stop | string/array | No | Up to 4 sequences where the API will stop generating |
| max_tokens | integer | No | Maximum number of tokens to generate |
| presence_penalty | number | No | Presence penalty, range -2.0 to 2.0, default is 0 |
| frequency_penalty | number | No | Frequency penalty, range -2.0 to 2.0, default is 0 |
| logit_bias | object | No | Modifies the likelihood of specified tokens appearing in the completion |
| response_format | object | No | Specifies the format the model must output |
| seed | integer | No | Seed value for deterministic sampling |
| parallel_tool_calls | boolean | No | Whether to allow parallel tool calls |
| user | string | No | A unique identifier representing the end user |
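
As a rough illustration of the request described above, the following Python sketch builds (but does not send) a POST to /v1/chat/completions using only the standard library. The base URL, the Bearer authorization scheme, and the helper name build_chat_request are assumptions made for this example; they are not specified in this document.

```python
import json
import urllib.request

# Hypothetical base URL; replace with the address of your deployment.
BASE_URL = "https://api.example.com"


def build_chat_request(payload: dict, api_key: str,
                       base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request to /v1/chat/completions."""
    for field in ("model", "messages"):  # the two required parameters
        if field not in payload:
            raise ValueError(f"missing required parameter: {field}")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


request = build_chat_request(
    {"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]},
    api_key="YOUR_API_KEY",
)
```

Passing the returned object to urllib.request.urlopen would send it; the sketch stops short of that so it can be run without network access.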

Message Object Types

Developer Message

{
  "role": "developer",
  "content": "You are a helpful assistant focused on answering user questions."
}

Instructions provided by the developer that the model should follow regardless of what messages the user sends. In newer OpenAI models (such as the o1 series), developer messages replace the previous system messages.

System Message

{
  "role": "system",
  "content": "You are a helpful assistant focused on answering user questions."
}

Instructions provided by the developer that the model should follow regardless of what messages the user sends. In newer OpenAI models (such as the o1 series), it is recommended to use developer messages instead of system messages.

User Message

{
  "role": "user",
  "content": "Hello, please introduce yourself."
}

Messages sent by the end user, containing prompts or additional context information.

Assistant Message

{
  "role": "assistant",
  "content": "I am an AI assistant that can answer questions and provide information."
}

Messages sent by the model in response to user messages.

Tool Message

{
  "role": "tool",
  "content": "{\"temperature\":32,\"unit\":\"celsius\",\"description\":\"Sunny\",\"humidity\":45}",
  "tool_call_id": "call_abc123"
}

Tool messages contain the results of tool calls and must include the following fields:

  • content: The content of the tool message (string)
  • role: The role of the message author, in this case "tool"
  • tool_call_id: The ID of the tool call this message is responding to
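
The three required fields listed above can be checked before a request is sent. The following is a minimal sketch; the function name validate_tool_message and the wording of the problem strings are hypothetical and chosen only for this example.

```python
def validate_tool_message(message: dict) -> list:
    """Return a list of problems; an empty list means the message is well formed."""
    problems = []
    if message.get("role") != "tool":
        problems.append('role must be "tool"')
    if not isinstance(message.get("content"), str):
        problems.append("content must be a string")
    tool_call_id = message.get("tool_call_id")
    if not isinstance(tool_call_id, str) or not tool_call_id:
        problems.append("tool_call_id is required")
    return problems


good = {
    "role": "tool",
    "content": "{\"temperature\":32,\"unit\":\"celsius\"}",
    "tool_call_id": "call_abc123",
}
print(validate_tool_message(good))
```

Note that content is a string: structured results such as the weather data above are JSON-encoded before being placed in the message.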

Tools and Function Definitions

Tool Object

{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get weather information for a specified city",
    "parameters": {
      "type": "object",
      "properties": {
        "location": {
          "type": "string",
          "description": "City name, such as Beijing, Shanghai"
        },
        "unit": {
          "type": "string",
          "enum": ["celsius", "fahrenheit"],
          "description": "Temperature unit"
        }
      },
      "required": ["location"]
    },
    "strict": true
  }
}

Tool objects contain the following fields:

  • type: Tool type, currently only "function" is supported
  • function: Function definition, containing the following fields:
    • name: Function name (required). May contain only the characters a-z, A-Z, 0-9, underscores, and hyphens, with a maximum length of 64
    • description: Description of the function's functionality (optional). The model uses this to decide when and how to call the function
    • parameters: Parameters accepted by the function, described as a JSON Schema object (optional)
    • strict: Whether to enable strict mode adherence when generating function calls (optional, default is false)
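
A tool definition can be assembled and its name checked against the constraints above before it is sent. This is a minimal sketch; the helper name make_function_tool is hypothetical, and the regular expression is one reading of the documented character set and length limit.

```python
import re
from typing import Optional

# Letters, digits, underscores and hyphens, 1 to 64 characters.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def make_function_tool(name: str,
                       description: Optional[str] = None,
                       parameters: Optional[dict] = None,
                       strict: bool = False) -> dict:
    """Build a tool object, rejecting names that break the documented rules."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid function name: {name!r}")
    function = {"name": name, "strict": strict}
    if description is not None:  # optional field
        function["description"] = description
    if parameters is not None:  # optional JSON Schema object
        function["parameters"] = parameters
    return {"type": "function", "function": function}


tool = make_function_tool(
    "get_weather",
    description="Get weather information for a specified city",
    parameters={
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
)
```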

Tool Choice Object

Used to specify a particular tool:

{
  "type": "function",
  "function": {
    "name": "get_weather"
  }
}
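
The tool_choice defaults given in the request body description can be mirrored on the client side, for example when logging what a request will do. A minimal sketch follows; both helper names are hypothetical, and the server remains the authority on how tool_choice is actually applied.

```python
from typing import Union


def force_tool(name: str) -> dict:
    """Build a tool choice object that names one particular function."""
    return {"type": "function", "function": {"name": name}}


def resolve_tool_choice(payload: dict) -> Union[str, dict]:
    """Return the tool_choice that the documented defaults imply for a request body."""
    if "tool_choice" in payload:
        return payload["tool_choice"]
    # Documented defaults: "auto" when tools are present, otherwise "none".
    return "auto" if payload.get("tools") else "none"


print(resolve_tool_choice({"model": "gpt-4o", "messages": []}))
print(resolve_tool_choice({"model": "gpt-4o", "messages": [], "tools": [{"type": "function"}]}))
print(force_tool("get_weather"))
```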

Message Content Format

Text Content

{
  "role": "user",
  "content": "Hello, please introduce yourself."
}

Image Content

{
  "role": "user",
  "content": [
    {
      "type": "text",
      "text": "What's in this image?"
    },
    {
      "type": "image_url",
      "image_url": {
        "url": "https://example.com/image.jpg",
        "detail": "high"
      }
    }
  ]
}

Tool Call Content

{
  "role": "assistant",
  "content": null,
  "tool_calls": [
    {
      "id": "call_abc123",
      "type": "function",
      "function": {
        "name": "get_weather",
        "arguments": "{\"location\":\"Beijing\",\"unit\":\"celsius\"}"
      }
    }
  ]
}

Response

Non-streaming Response

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you?",
      "reasoning_content": "User greeted in Chinese, I should respond in Chinese."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}

Streaming Response

When stream=true, the server will send a series of Server-Sent Events (SSE), each containing a partial response. The format of each event is as follows:

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-3.5-turbo","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"reasoning_content":"User greeted in Chinese, "},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-3.5-turbo","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"reasoning_content":"I should respond in Chinese."},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-3.5-turbo","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-3.5-turbo","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-3.5-turbo","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]

reasoning_content Field

Bella OpenAPI extends the standard OpenAI interface with the reasoning_content field to provide the model's reasoning process:

  • Only models that output reasoning processes will return the reasoning_content field
  • In non-streaming responses, reasoning_content is returned as a property of the Message object
  • In streaming responses, reasoning_content is returned in chunks through the delta object
  • The reasoning content helps developers understand the model's thought process and decision basis
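
On the client side, the streamed chunks can be folded back into complete strings. The sketch below reads SSE lines of the form shown in the streaming response above and accumulates reasoning_content and content separately. The function name collect_stream is hypothetical, and the sketch assumes a single choice (n=1), so it only reads the choice at index 0.

```python
import json
from typing import Iterable, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, str]:
    """Return (reasoning_content, content) accumulated from SSE lines."""
    reasoning, content = [], []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            if choice.get("index", 0) != 0:
                continue  # assumption: only the first choice is of interest
            delta = choice.get("delta", {})
            reasoning.append(delta.get("reasoning_content") or "")
            content.append(delta.get("content") or "")
    return "".join(reasoning), "".join(content)


sample = [
    'data: {"choices":[{"index":0,"delta":{"reasoning_content":"User greeted in Chinese, "},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"reasoning_content":"I should respond in Chinese."},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "data: [DONE]",
]
reasoning_text, answer_text = collect_stream(sample)
```

The sample lines are trimmed copies of the events above; fields such as id and model are omitted because the accumulator does not read them.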

Response Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| id | string | Unique identifier for the response |
| object | string | Object type, typically "chat.completion" or "chat.completion.chunk" |
| created | integer | Timestamp when the response was created |
| model | string | The model used |
| system_fingerprint | string | Fingerprint of the backend configuration the model ran on |
| choices | array | Array of completion choices |
| usage | object | Token usage statistics |

Choice Object

| Parameter | Type | Description |
| --- | --- | --- |
| index | integer | Index of the choice |
| message | object | In non-streaming responses, contains the complete message |
| delta | object | In streaming responses, contains the incremental part of the message |
| finish_reason | string | Reason for completion, possible values include "stop", "length", "tool_calls", "content_filter", or null |

Message/Delta Object

| Parameter | Type | Description |
| --- | --- | --- |
| role | string | Role of the message, typically "assistant" |
| content | string | Content of the message |
| reasoning_content | string | Contains the model's reasoning process |
| tool_calls | array | List of tool calls |

Usage Object

| Parameter | Type | Description |
| --- | --- | --- |
| prompt_tokens | integer | Number of tokens used in the prompt |
| completion_tokens | integer | Number of tokens used in the completion |
| total_tokens | integer | Total number of tokens used |

Error Codes

| Error Code | Description |
| --- | --- |
| 400 | Request parameter error |
| 401 | Authentication failed, invalid API key |
| 403 | Insufficient permissions, API key doesn't have permission to access the requested resource |
| 404 | Requested resource does not exist |
| 429 | Too many requests, exceeded rate limit |
| 500 | Internal server error |
| 503 | Service temporarily unavailable |
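
A client might map these status codes to messages and decide which failures are worth retrying. The sketch below is one possible approach, not a documented policy: treating 429, 500, and 503 as retryable and using exponential backoff are common client-side conventions assumed here, and the names classify_error and backoff_seconds are hypothetical.

```python
from typing import Tuple

ERROR_DESCRIPTIONS = {
    400: "Request parameter error",
    401: "Authentication failed, invalid API key",
    403: "Insufficient permissions",
    404: "Requested resource does not exist",
    429: "Too many requests, exceeded rate limit",
    500: "Internal server error",
    503: "Service temporarily unavailable",
}

# Assumption: these failures are transient, so a later retry may succeed.
RETRYABLE_STATUSES = {429, 500, 503}


def classify_error(status: int) -> Tuple[str, bool]:
    """Return (description, retryable) for an HTTP status code."""
    description = ERROR_DESCRIPTIONS.get(status, "Unknown error")
    return description, status in RETRYABLE_STATUSES


def backoff_seconds(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff delay for a zero-based retry attempt, capped at `cap`."""
    return min(cap, base * (2 ** attempt))


print(classify_error(429), backoff_seconds(2))
```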

Examples

Basic Request Example

{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello, please introduce yourself."
    }
  ],
  "temperature": 0.7,
  "stream": false
}

Tool Call Request Example

{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": "What's the weather like in Beijing today?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get weather information for a specified city",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "City name, such as Beijing, Shanghai"
            },
            "unit": {
              "type": "string",
              "enum": ["celsius", "fahrenheit"],
              "description": "Temperature unit"
            }
          },
          "required": ["location"]
        },
        "strict": true
      }
    }
  ],
  "tool_choice": "auto"
}

Tool Call Response Example

When the model decides to call a tool, the response will include the tool_calls field:

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": null,
      "reasoning_content": "User is asking about the weather in Beijing, I need to call the weather query function to get this information.",
      "tool_calls": [
        {
          "id": "call_abc123",
          "type": "function",
          "function": {
            "name": "get_weather",
            "arguments": "{\"location\":\"Beijing\",\"unit\":\"celsius\"}"
          }
        }
      ]
    },
    "finish_reason": "tool_calls"
  }],
  "usage": {
    "prompt_tokens": 82,
    "completion_tokens": 25,
    "total_tokens": 107
  }
}

Tool Result Submission Example

After executing the tool, submit its result back to the conversation as a tool message that follows the assistant message containing the tool call:

{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": "What's the weather like in Beijing today?"
    },
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "id": "call_abc123",
          "type": "function",
          "function": {
            "name": "get_weather",
            "arguments": "{\"location\":\"Beijing\",\"unit\":\"celsius\"}"
          }
        }
      ]
    },
    {
      "role": "tool",
      "tool_call_id": "call_abc123",
      "name": "get_weather",
      "content": "{\"temperature\":32,\"unit\":\"celsius\",\"description\":\"Sunny\",\"humidity\":45}"
    }
  ]
}
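
The step between receiving tool_calls and submitting the results can be sketched as a small dispatcher: decode each call's arguments string, run the matching local function, and wrap the result in a tool message. Everything named here is hypothetical: get_weather returns canned data for illustration, and answer_tool_calls and the registry are not part of the API.

```python
import json
from typing import Callable, Dict, List


def get_weather(location: str, unit: str = "celsius") -> dict:
    """Hypothetical local tool; returns fixed data instead of calling a weather service."""
    return {"temperature": 32, "unit": unit, "description": "Sunny", "humidity": 45}


def answer_tool_calls(assistant_message: dict,
                      registry: Dict[str, Callable[..., object]]) -> List[dict]:
    """Execute each tool call and return the tool messages to append."""
    replies = []
    for call in assistant_message.get("tool_calls") or []:
        function = call["function"]
        arguments = json.loads(function["arguments"])  # arguments arrive as a JSON string
        result = registry[function["name"]](**arguments)
        replies.append({
            "role": "tool",
            "tool_call_id": call["id"],  # ties the result to the originating call
            "content": json.dumps(result),
        })
    return replies


assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
            "name": "get_weather",
            "arguments": "{\"location\":\"Beijing\",\"unit\":\"celsius\"}",
        },
    }],
}
tool_messages = answer_tool_calls(assistant_message, {"get_weather": get_weather})
```

The next request would then carry the original user message, assistant_message, and tool_messages, in that order.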

Tool Result Response Example

The model's response after processing the tool results:

{
  "id": "chatcmpl-456",
  "object": "chat.completion",
  "created": 1677652290,
  "model": "gpt-4o",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Beijing's weather today is sunny with a temperature of 32°C and humidity of 45%. It's quite hot, so I recommend using sun protection and staying hydrated.",
      "reasoning_content": "Based on the data returned by the weather API, Beijing today has sunny weather, a temperature of 32 degrees Celsius, and humidity of 45%. This is relatively hot weather, so I should remind the user about sun protection and hydration."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 110,
    "completion_tokens": 42,
    "total_tokens": 152
  }
}

Image Input Example

User messages can include images alongside text:

{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What's in this image?"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "https://example.com/image.jpg",
            "detail": "high"
          }
        }
      ]
    }
  ]
}

Image URL Format

Image URLs can be in one of the following formats:

  1. Internet-accessible URL:

    {
      "type": "image_url",
      "image_url": {
        "url": "https://example.com/image.jpg"
      }
    }
  2. Base64-encoded image data:

    {
      "type": "image_url",
      "image_url": {
        "url": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL..."
      }
    }
Image detail level: with either URL format, the detail parameter of image_url sets the level of detail for image analysis. Possible values:

    • "high": High-detail analysis, suited to scenarios that require identifying small text or fine details
    • "low": Low-detail analysis; faster to process and consumes fewer tokens
    • "auto": The default when detail is not specified
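
For the Base64 format, the data URL can be produced from raw image bytes. The sketch below is a minimal example using the standard library; the helper name image_part and its default MIME type are assumptions made for illustration.

```python
import base64

ALLOWED_DETAIL = ("high", "low", "auto")


def image_part(data: bytes, mime_type: str = "image/jpeg", detail: str = "auto") -> dict:
    """Build an image_url content part that embeds the image as a Base64 data URL."""
    if detail not in ALLOWED_DETAIL:
        raise ValueError(f"detail must be one of {ALLOWED_DETAIL}")
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {
            "url": f"data:{mime_type};base64,{encoded}",
            "detail": detail,
        },
    }


# In practice the bytes would come from a file, e.g. open("photo.jpg", "rb").read().
part = image_part(b"abc", detail="low")
```

The resulting dictionary can be placed in the content array of a user message next to a text part.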

Image Input Response Example

{
  "id": "chatcmpl-789",
  "object": "chat.completion",
  "created": 1677652295,
  "model": "gpt-4o",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "This image shows an orange cat sitting on a windowsill, looking outside. Through the window, you can see some green trees and a blue sky. The cat looks relaxed, with its tail curled up next to its body.",
      "reasoning_content": "The image contains an orange cat sitting on a windowsill looking outside. I can see trees and sky outside the window. The cat's posture indicates it is relaxed. I should describe in detail what I see, including the cat's color, position, surroundings, and posture."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 1042,
    "completion_tokens": 65,
    "total_tokens": 1107
  }
}

Streaming Tool Call Example

When using tool calls in streaming mode, the response is returned across multiple events:

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"reasoning_content":"User is asking about the weather in Beijing, I need to call"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"reasoning_content":" the weather query function to get this information."},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"id":"call_abc123","type":"function","function":{"name":"get_weather"}}]},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\""}}]},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"location"}}]},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"\":\""}}]},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"Beijing"}}]},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"\",\""}}]},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"unit"}}]},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"\":\""}}]},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"celsius"}}]},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"\"}"}}]},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","system_fingerprint":"fp_44709d6fcb","usage":{"prompt_tokens": 1042,"completion_tokens": 65,"total_tokens": 1107},"choices":[{"index":0,"delta":{},"finish_reason":"tool_calls"}]}

data: [DONE]
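
Because the arguments string arrives in fragments, a client has to concatenate the pieces per tool call before parsing them. The sketch below merges the tool_calls deltas by their index field, using already-decoded chunk dictionaries. The function name merge_tool_call_deltas is hypothetical, and the sample chunks are trimmed to the fields the function reads.

```python
import json
from typing import Dict, Iterable, List


def merge_tool_call_deltas(chunks: Iterable[dict]) -> List[dict]:
    """Combine streamed tool_calls deltas into complete tool call objects."""
    calls: Dict[int, dict] = {}
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            for delta in choice.get("delta", {}).get("tool_calls") or []:
                call = calls.setdefault(delta["index"], {
                    "id": None,
                    "type": "function",
                    "function": {"name": "", "arguments": ""},
                })
                if delta.get("id"):
                    call["id"] = delta["id"]  # sent once, in the first fragment
                function = delta.get("function", {})
                call["function"]["name"] += function.get("name") or ""
                call["function"]["arguments"] += function.get("arguments") or ""
    return [calls[index] for index in sorted(calls)]


# The argument fragments from the events above, in order.
fragments = ['{"', "location", '":"', "Beijing", '","', "unit", '":"', "celsius", '"}']
chunks = [{"choices": [{"index": 0, "delta": {"tool_calls": [{
    "index": 0, "id": "call_abc123", "type": "function",
    "function": {"name": "get_weather"}}]}}]}]
chunks += [{"choices": [{"index": 0, "delta": {"tool_calls": [{
    "index": 0, "function": {"arguments": fragment}}]}}]} for fragment in fragments]

calls = merge_tool_call_deltas(chunks)
arguments = json.loads(calls[0]["function"]["arguments"])
```

The arguments string should only be parsed once the stream has finished, since any intermediate state is an incomplete JSON document.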