Chat Completions

Create a chat completion. OpenAI-compatible.

POST /chat/completions

Parameter	Type	Required	Description
model	string	Yes	Model ID (e.g. deepseek/deepseek-chat)
messages	array	Yes	Array of message objects with role and content
temperature	number	No	Sampling temperature (0-2). Default: 1
top_p	number	No	Nucleus sampling. Default: 1
max_tokens	integer	No	Maximum tokens to generate
stream	boolean	No	Stream response via SSE. Default: false
stop	string|array	No	Stop sequences
frequency_penalty	number	No	Frequency penalty (-2 to 2). Default: 0
presence_penalty	number	No	Presence penalty (-2 to 2). Default: 0
tools	array	No	Function calling tools (OpenAI format)
tool_choice	string|object	No	Tool selection mode

curl https://dragonfly-api.com/v1/chat/completions \
  -H "Authorization: Bearer sk-df-xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-chat",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is 2+2?"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1709000000,
  "model": "deepseek/deepseek-chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "2 + 2 = 4"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 18,
    "completion_tokens": 7,
    "total_tokens": 25,
    "cost_usd": 0.000022
  }
}
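
The same request can be issued from a script. The following is a minimal sketch using only the Python standard library. The helper names build_chat_request and extract_reply are illustrative and not part of the API; the base URL and key placeholder are copied from the curl example above.

```python
import json
import urllib.request

BASE_URL = "https://dragonfly-api.com/v1"  # taken from the curl example


def build_chat_request(api_key, model, messages, **params):
    """Build (but do not send) a POST /chat/completions request."""
    body = {"model": model, "messages": messages, **params}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(response_body):
    """Return the assistant text and the usage block from a parsed response."""
    choice = response_body["choices"][0]
    return choice["message"]["content"], response_body.get("usage", {})


req = build_chat_request(
    "sk-df-xxx",
    "deepseek/deepseek-chat",
    [{"role": "user", "content": "What is 2+2?"}],
    temperature=0.7,
    max_tokens=1024,
)
# To actually send the request:
# with urllib.request.urlopen(req) as resp:
#     text, usage = extract_reply(json.load(resp))
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any credits are spent.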

Streaming

Set stream: true to receive responses as Server-Sent Events (SSE). Each chunk is a JSON object prefixed with data:. The stream ends with data: [DONE].

curl https://dragonfly-api.com/v1/chat/completions \
  -H "Authorization: Bearer sk-df-xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-chat",
    "stream": true,
    "messages": [{"role": "user", "content": "Write a haiku"}]
  }'
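
On the client side, the stream has to be split into data: lines and reassembled. The sketch below shows one way to do that in Python. It assumes each chunk follows the OpenAI streaming shape (choices[0].delta.content), which this page does not spell out; the function name iter_stream_text and the sample chunks are made up for illustration.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of SSE lines (str)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        # Assumption: OpenAI-style chunks carrying choices[].delta.content.
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Made-up sample lines standing in for a real response body.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Old pond"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": ", a frog"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

With a live connection, the same generator can be fed the decoded lines of the HTTP response instead of the sample list.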

Tool Calling

Pass an array of tool definitions. The model returns tool_calls in the response when it decides to invoke a tool. Compatible with the OpenAI function calling format.

Parameter	Type	Required	Description
tools	array	No	Array of tool definitions. Each tool has type: "function" and a function object.
tool_choice	string|object	No	"auto" (default), "none", or {"type":"function","function":{"name":"..."}}
function.name	string	Yes	Function name (no spaces, letters/digits/underscores only)
function.description	string	No	Description of when to call this function
function.parameters	object	No	JSON Schema object describing the function parameters

Response

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"Shanghai\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}
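
To make the round trip concrete, here is a hedged sketch of a tool definition and of handling the tool_calls returned above. The get_weather function, the LOCAL_TOOLS registry, and run_tool_calls are hypothetical helpers. The follow-up message shape (role "tool" with tool_call_id) is assumed from the OpenAI format and is not confirmed on this page.

```python
import json

# A tool definition built from the fields in the parameter table above.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]


def get_weather(location):
    """Hypothetical local tool; a real one would query a weather service."""
    return {"location": location, "forecast": "unknown"}


LOCAL_TOOLS = {"get_weather": get_weather}


def run_tool_calls(message):
    """Execute each tool call and return follow-up messages for the model."""
    results = []
    for call in message.get("tool_calls", []):
        fn = LOCAL_TOOLS[call["function"]["name"]]
        # "arguments" arrives as a JSON-encoded string, not an object.
        args = json.loads(call["function"]["arguments"])
        results.append({
            # Assumption: OpenAI-style tool result message.
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })
    return results


message = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
            "name": "get_weather",
            "arguments": "{\"location\": \"Shanghai\"}",
        },
    }],
}
follow_up = run_tool_calls(message)
```

The assistant message and the entries in follow_up would then be appended to messages for the next request.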

Embeddings

Generate vector embeddings from text. Compatible with the OpenAI embeddings API format.

POST /embeddings

Parameter	Type	Required	Description
model	string	Yes	Embedding model ID (e.g. text-embedding-3-small)
input	string|array	Yes	String or array of strings to embed
encoding_format	string	No	"float" (default) or "base64"
dimensions	integer	No	Reduce output dimensions (model-dependent)
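
As an illustration of these parameters, the sketch below builds a request body and decodes a base64-encoded vector. The helper names are hypothetical. The decoding assumes that "base64" means little-endian 32-bit floats, as in the OpenAI API; this page does not state the byte layout.

```python
import base64
import struct


def build_embeddings_body(model, texts, encoding_format="float", dimensions=None):
    """Build the JSON body for POST /embeddings."""
    if encoding_format not in ("float", "base64"):
        raise ValueError("encoding_format must be 'float' or 'base64'")
    body = {"model": model, "input": texts, "encoding_format": encoding_format}
    if dimensions is not None:
        body["dimensions"] = dimensions  # model-dependent, per the table
    return body


def decode_base64_embedding(encoded):
    """Decode a base64 embedding into a list of floats.

    Assumption: the payload is packed little-endian float32 values.
    """
    raw = base64.b64decode(encoded)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


body = build_embeddings_body("text-embedding-3-small", ["hello", "world"])
```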

Market Routing

Dragonfly operates a market where third-party providers contribute API keys at discounted prices.

Parameter	Type	Required	Description
Routing-Strategy	header	No	Set to market-only to use marketplace seller capacity only. Fails with 503 if no eligible seller is available. Requires accepted Trading Terms.

curl https://dragonfly-api.com/v1/chat/completions \
  -H "Authorization: Bearer sk-df-xxx" \
  -H "Content-Type: application/json" \
  -H "Routing-Strategy: market-only" \
  -d '{
    "model": "deepseek/deepseek-chat",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Parameter	Type	Required	Description
X-Max-Price	header	No	Max acceptable input-side price (hard filter).
X-Max-TTFT-Ms	header	No	Preferred max time to first token (TTFT), in milliseconds (soft preference).
X-Min-Throughput-TPS	header	No	Preferred minimum throughput, in tokens per second (soft preference).
X-Hard-Max-TTFT-Ms	header	No	Strict max TTFT, in milliseconds (hard filter).
X-Hard-Min-Throughput-TPS	header	No	Strict minimum throughput, in tokens per second (hard filter).
X-Privacy-Mode	header	No	Privacy filter: trusted_only.
X-Provider-Only	header	No	Comma-separated provider allowlist (hard filter).
X-Provider-Ignore	header	No	Comma-separated provider denylist (hard filter).
X-Provider-Order	header	No	Comma-separated provider preference order (soft ranking).
X-DF-Market-Route	response header	No	Route taken for the request: seller when a marketplace seller served it.
X-DF-Market-Provider	response header	No	Matched provider slug.
X-DF-Market-Listing-Id	response header	No	Matched listing ID (seller route).
X-DF-Market-Request-Id	response header	No	Matching request ID.
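
The request headers above can be assembled programmatically. The sketch below is one possible helper; the function name, its keyword arguments, and the provider slugs are hypothetical, and the value formats (for example a plain decimal for X-Max-Price) are assumptions, since this page does not specify units or encoding.

```python
def market_headers(market_only=False, max_price=None, hard_max_ttft_ms=None,
                   trusted_only=False, provider_only=(), provider_ignore=(),
                   provider_order=()):
    """Translate routing preferences into the request headers listed above."""
    headers = {}
    if market_only:
        headers["Routing-Strategy"] = "market-only"
    if max_price is not None:
        headers["X-Max-Price"] = str(max_price)  # assumed: plain decimal
    if hard_max_ttft_ms is not None:
        headers["X-Hard-Max-TTFT-Ms"] = str(int(hard_max_ttft_ms))
    if trusted_only:
        headers["X-Privacy-Mode"] = "trusted_only"
    # Provider lists are comma-separated, per the table.
    if provider_only:
        headers["X-Provider-Only"] = ",".join(provider_only)
    if provider_ignore:
        headers["X-Provider-Ignore"] = ",".join(provider_ignore)
    if provider_order:
        headers["X-Provider-Order"] = ",".join(provider_order)
    return headers


# "provider-a" and "provider-b" are placeholder slugs.
headers = market_headers(
    market_only=True,
    max_price=0.5,
    provider_only=["provider-a", "provider-b"],
)
```

These headers are merged with Authorization and Content-Type on the chat completion request.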

List Models

List all available models.

GET /models

curl https://dragonfly-api.com/v1/models

Audio Speech (TTS)

Generate speech from text. Powered by MiniMax.

POST /audio/speech

Parameter	Type	Required	Description
model	string	Yes	minimax/speech-02-hd or minimax/speech-02-turbo
input	string	Yes	Text to speak
voice	string	No	Voice ID. Default: alloy
speed	number	No	Speed (0.25-4.0). Default: 1.0
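
A small sketch of validating these parameters before sending a request follows. The helper name build_speech_body is hypothetical, and the format of the returned audio is not documented here, so the sketch covers only the request body.

```python
TTS_MODELS = {"minimax/speech-02-hd", "minimax/speech-02-turbo"}


def build_speech_body(model, text, voice="alloy", speed=1.0):
    """Build the JSON body for POST /audio/speech, checking documented limits."""
    if model not in TTS_MODELS:
        raise ValueError(f"unsupported TTS model: {model}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    return {"model": model, "input": text, "voice": voice, "speed": speed}


body = build_speech_body("minimax/speech-02-turbo", "Hello there", speed=1.25)
```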

Models & Pricing

Prices are in USD per million tokens.

DeepSeek

Model ID	Input	Output	Context	Capabilities
deepseek/deepseek-chat	$0.50	$1.50	128K	Chat, Code
deepseek/deepseek-reasoner	$2.00	$8.00	128K	Reasoning

MiniMax

Model ID	Input	Output	Context	Capabilities
minimax/minimax-m2.5	$0.80	$2.40	1M	Chat
minimax/minimax-m2.5-highspeed	$0.40	$1.20	1M	Chat (fast)
minimax/speech-02-hd	—	—	—	TTS (HD)
minimax/speech-02-turbo	—	—	—	TTS (fast)
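
For budgeting, a rough cost estimate can be derived from the list prices above and the token counts in usage. The sketch below is only an estimate under that assumption; the function name is hypothetical, and the cost_usd field returned in usage should be treated as authoritative, since it need not match list-price arithmetic.

```python
# (input, output) list prices in USD per million tokens, from the tables above.
PRICES = {
    "deepseek/deepseek-chat": (0.50, 1.50),
    "deepseek/deepseek-reasoner": (2.00, 8.00),
    "minimax/minimax-m2.5": (0.80, 2.40),
    "minimax/minimax-m2.5-highspeed": (0.40, 1.20),
}


def estimate_cost_usd(model, prompt_tokens, completion_tokens):
    """Estimate cost from list prices; not a substitute for usage.cost_usd."""
    input_price, output_price = PRICES[model]
    total = prompt_tokens * input_price + completion_tokens * output_price
    return total / 1_000_000


estimate = estimate_cost_usd("deepseek/deepseek-chat", 18, 7)
```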

Rate Limits

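Rate limits apply per API key. When a limit is exceeded, the API returns 429 with type rate_limit_exceeded; a 429 with type insufficient_quota means the account has run out of credits. After a rate-limit 429, wait the number of seconds given in the Retry-After header before retrying.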

Header	Meaning
X-RateLimit-Limit	Request limit for the current window
X-RateLimit-Remaining	Requests remaining in current window
X-RateLimit-Reset	Unix timestamp when the window resets
Retry-After	Seconds to wait before retrying (on 429)
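
A client can turn these headers into a wait time. The sketch below is one way to do so; the function name and the 1-second fallback are arbitrary choices, and it assumes the headers are available in a dict keyed by the exact names shown in the table.

```python
import time


def seconds_to_wait(headers, now=None):
    """Return how long to sleep before retrying after a 429."""
    now = time.time() if now is None else now
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return max(0.0, float(retry_after))
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        # X-RateLimit-Reset is a Unix timestamp, so subtract the current time.
        return max(0.0, float(reset) - now)
    return 1.0  # arbitrary fallback when neither header is present


wait = seconds_to_wait({"Retry-After": "3"})
```

Retry-After is preferred when present because it is already a duration and does not depend on the client clock.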

Error Codes

Code	Type	Description
400	bad_request	Invalid request parameters
401	unauthorized	Missing or invalid API key
403	forbidden	Access denied (trading terms not accepted, account suspended, or permission error)
404	not_found	Model or resource not found
429	rate_limit_exceeded	Too many requests. Wait Retry-After seconds, or until X-RateLimit-Reset, before retrying.
429	insufficient_quota	Insufficient credits. Response includes top_up_url.
500	internal_error	Server error. Retry with exponential backoff.
502	upstream_error	Upstream provider error
503	market_unavailable	No eligible market seller available. Retry or check seller supply.

Error Response Format

Response

{
  "error": {
    "message": "Insufficient credits",
    "type": "insufficient_quota",
    "code": 429,
    "top_up_url": "https://dragonfly-api.com/dashboard/billing"
  }
}
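
Because two different conditions share status 429, clients should generally branch on the type field rather than on the status code alone. The following sketch illustrates that. The function name and the choice of which types count as retryable are assumptions; in particular, the table above does not say whether upstream_error should be retried.

```python
import json

# Assumption: these types are worth retrying. upstream_error is a guess.
RETRYABLE_TYPES = {
    "rate_limit_exceeded",
    "internal_error",
    "upstream_error",
    "market_unavailable",
}


def classify_error(body_text):
    """Parse an error response body and decide whether a retry makes sense."""
    err = json.loads(body_text).get("error", {})
    kind = err.get("type")
    return {
        "type": kind,
        "message": err.get("message"),
        "retry": kind in RETRYABLE_TYPES,
        "top_up_url": err.get("top_up_url"),  # only set for insufficient_quota
    }


info = classify_error(json.dumps({
    "error": {
        "message": "Insufficient credits",
        "type": "insufficient_quota",
        "code": 429,
        "top_up_url": "https://dragonfly-api.com/dashboard/billing",
    }
}))
```

Here info["retry"] is False, so the caller would surface top_up_url to the user instead of sleeping and retrying.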