Skip to main content

Text-to-Image API Documentation

Overview

The Text-to-Image API provides the capability to generate images based on text prompts, compatible with the OpenAI Images API specification. It supports multiple models and flexible parameter configurations to generate images of different qualities, sizes, and styles.

Endpoint

POST /v1/images/generations

Supported Models

ModelDescriptionMax Prompt LengthSupported Sizes
dall-e-2DALL·E 2 model, lower cost1,000 characters256x256, 512x512, 1024x1024
dall-e-3DALL·E 3 model, higher quality4,000 characters1024x1024, 1792x1024, 1024x1792
gpt-image-1GPT Image 1 model, supports more customization options32,000 charactersSupports multiple sizes
doubao-seedream-3-0-t2iHuoshan platform, supports specific optionsNot specifiedSupports multiple sizes

Request Format

Request Headers

Content-Type: application/json
Authorization: Bearer YOUR_API_KEY

Request Body Parameters

ParameterTypeRequiredDescription
promptstringImage description text. Length limit see table above
modelstringModel name to use
nintegerNumber of images to generate, range 1-10. dall-e-3 only supports n=1
qualitystringImage quality. Only dall-e-3 supports: standard (default), hd
response_formatstringReturn format: url (default), b64_json
sizestringImage size, see supported models table
stylestringImage style. Only dall-e-3 supports: vivid (default), natural
userstringUser identifier for monitoring and abuse prevention

GPT Image 1 Specific Parameters

ParameterTypeRequiredDescription
backgroundstringBackground transparency: auto (default), transparent, opaque
moderationstringContent moderation level: auto (default), low
output_compressionintegerOutput compression level, 0-100%, default 100
output_formatstringOutput format: png, jpeg, webp

Volcano Engine Protocol Extension Parameters

ParameterTypeRequiredDescription
watermarkbooleanWhether to add watermark
seedintegerRandom seed, range [-1, 2147483647]
guidance_scalefloatGeneration freedom, range [1, 10]

Response Format

Success Response

{
"created": 1677649296,
"background": "transparent",
"data": [
{
"b64_json": null,
"url": "https://example.com/image.png",
"revised_prompt": "Revised prompt",
"output_format": "png",
"quality": "standard",
"size": "1024x1024"
}
],
"usage": {
"prompt_tokens": 50,
"input_tokens_details": {
"text_tokens": 40,
"image_tokens": 10
}
}
}

Response Field Description

FieldTypeDescription
createdintegerUnix timestamp
backgroundstringBackground parameter (gpt-image-1 only)
dataarrayList of generated images
usageobjectToken usage statistics (gpt-image-1 only)

ImageData Object

FieldTypeDescription
b64_jsonstringBase64 encoded image data (when response_format=b64_json)
urlstringImage URL (when response_format=url)
revised_promptstringRevised prompt
output_formatstringOutput format (gpt-image-1 only)
qualitystringImage quality
sizestringImage size

Usage Object (gpt-image-1 only)

FieldTypeDescription
prompt_tokensintegerTotal input token count
input_tokens_detailsobjectDetailed token statistics

InputTokensDetails Object

FieldTypeDescription
text_tokensintegerText token count
image_tokensintegerImage token count

Request Examples

Basic Example (DALL·E 2)

curl -X POST "https://api.example.com/v1/images/generations" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"prompt": "A cute orange cat sitting in sunlight",
"model": "dall-e-2",
"n": 1,
"size": "1024x1024",
"response_format": "url"
}'

High Quality Example (DALL·E 3)

curl -X POST "https://api.example.com/v1/images/generations" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"prompt": "Futuristic cityscape with neon lights, cyberpunk style",
"model": "dall-e-3",
"quality": "hd",
"style": "vivid",
"size": "1792x1024"
}'

GPT Image 1 Advanced Example

curl -X POST "https://api.example.com/v1/images/generations" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"prompt": "A blooming magnolia flower with blurred background, photography style",
"model": "gpt-image-1",
"background": "transparent",
"output_format": "png",
"output_compression": 85,
"moderation": "low",
"n": 2
}'

JavaScript Example

const response = await fetch('https://api.example.com/v1/images/generations', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_API_KEY',
'Content-Type': 'application/json'
},
body: JSON.stringify({
prompt: 'Van Gogh style starry night over a small town',
model: 'dall-e-3',
quality: 'hd',
size: '1024x1024'
})
});

const data = await response.json();
console.log('Generated image URL:', data.data[0].url);

Python Example

import requests
import json

url = "https://api.example.com/v1/images/generations"
headers = {
"Authorization": "Bearer YOUR_API_KEY",
"Content-Type": "application/json"
}

payload = {
"prompt": "Watercolor style landscape painting with profound artistic conception",
"model": "dall-e-3",
"quality": "standard",
"style": "natural",
"size": "1024x1024"
}

response = requests.post(url, headers=headers, data=json.dumps(payload))
result = response.json()

print(f"Generated image URL: {result['data'][0]['url']}")

Error Handling

Common Error Codes

Status CodeError TypeDescriptionSolution
400invalid_request_errorInvalid request parametersCheck required parameters and parameter format
401authentication_errorInvalid API KeyCheck Authorization header
429rate_limit_exceededRequest rate too highReduce request frequency or upgrade plan
500server_errorInternal server errorRetry later

Error Response Example

{
"error": {
"code": "invalid_request_error",
"message": "Invalid model specified",
"param": "model",
"type": "invalid_request_error"
}
}

Billing Information

Billing Models

Different models use different billing methods:

DALL·E 2 and Doubao

  • Billed by number of generated images
  • Different prices for different sizes

DALL·E 3

  • Billed by number of generated images and quality
  • Supports standard quality and HD quality

GPT Image 1

  • Image Generation Fee: Billed by number of images, quality, and size
  • Token Fee: Billed by input token count (including text and image tokens)

Price Examples

The following are example prices (actual prices subject to platform announcement):

ModelQualitySizePrice/Image
DALL·E 2Standard256x256$0.016
DALL·E 2Standard512x512$0.018
DALL·E 2Standard1024x1024$0.020
DALL·E 3Standard1024x1024$0.040
DALL·E 3HD1024x1024$0.080
GPT Image 1Low Quality1024x1024$0.020
GPT Image 1Medium Quality1024x1024$0.040
GPT Image 1High Quality1024x1024$0.080

Token Billing (GPT Image 1)

  • Text Tokens: $0.002 / 1K tokens
  • Image Tokens: $0.010 / 1K tokens

Best Practices

1. Prompt Optimization

  • Specific Description: Use specific, detailed descriptions rather than vague concepts
  • Style Guidance: Clearly specify artistic style, photography style, or painting technique
  • Composition Description: Describe desired composition, angle, and perspective
  • Color and Lighting: Specify tone, lighting conditions, and atmosphere
// Before optimization
"A cat"

// After optimization
"An orange short-haired cat sitting on a sunlit windowsill, with green plants in the background, warm tones, natural lighting, photography style"

2. Parameter Selection Recommendations

  • Quality Settings:

    • Standard quality suitable for most scenarios
    • HD quality for scenes requiring fine details
  • Size Selection:

    • Social media: 1024x1024
    • Banner ads: 1792x1024
    • Vertical posters: 1024x1792

3. Performance Optimization

  • Batch Generation: Use n parameter to generate multiple images at once
  • Format Selection: URL format transfers faster, Base64 format convenient for direct processing
  • Compression Settings: Adjust compression level based on usage to balance quality and file size

4. Content Safety

  • Follow platform content policies
  • Use appropriate moderation levels
  • Avoid descriptions that might trigger safety filters

Frequently Asked Questions

Q: Why doesn't the generated image exactly match the prompt?

A: AI models interpret and understand prompts, which may involve some creative interpretation. You can improve matching by using more specific descriptions.

Q: How to improve consistency in image generation?

A:

  • Use the same seed value (Volcano Engine protocol)
  • Maintain consistency in prompts
  • Use the same model and parameters

Q: Can generated images be used commercially?

A: Please refer to specific terms of service and usage licenses. Generally, users have usage rights to generated images.

Q: How to handle generation failures?

A:

  • Check if prompts violate content policies
  • Simplify complex prompts
  • Retry or adjust parameters
  • Contact technical support

Note: This documentation is written based on the current API version. Specific features and pricing may change with version updates. Please refer to the latest official documentation.