Chat Completions
Create a chat completion. OpenAI-compatible.
/chat/completions
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID (e.g. deepseek/deepseek-chat) |
| messages | array | Yes | Array of message objects with role and content |
| temperature | number | No | Sampling temperature (0-2). Default: 1 |
| top_p | number | No | Nucleus sampling. Default: 1 |
| max_tokens | integer | No | Maximum tokens to generate |
| stream | boolean | No | Stream response via SSE. Default: false |
| stop | string or array | No | Stop sequences |
| frequency_penalty | number | No | Frequency penalty (-2 to 2). Default: 0 |
| presence_penalty | number | No | Presence penalty (-2 to 2). Default: 0 |
| tools | array | No | Function calling tools (OpenAI format) |
| tool_choice | string or object | No | Tool selection mode |
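
As a rough illustration of the parameter table, the following Python sketch builds (but does not send) a chat completion request using only the standard library. The helper name `build_chat_request` and the client-side range checks are assumptions made for this example; the base URL and header names are taken from the curl samples, and the server's own validation remains authoritative.

```python
import json
import urllib.request

BASE_URL = "https://dragonfly-api.com/v1"  # base URL as used in the curl examples


def build_chat_request(api_key, model, messages, **options):
    """Return an unsent urllib Request for /chat/completions (hypothetical helper)."""
    # Range checks mirror the documented ranges; this is a client-side convenience only.
    ranges = {
        "temperature": (0, 2),
        "frequency_penalty": (-2, 2),
        "presence_penalty": (-2, 2),
    }
    for name, (low, high) in ranges.items():
        value = options.get(name)
        if value is not None and not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    body = {"model": model, "messages": messages, **options}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "sk-df-xxx",
    "deepseek/deepseek-chat",
    [{"role": "user", "content": "What is 2+2?"}],
    temperature=0.7,
    max_tokens=1024,
)
# To send it: urllib.request.urlopen(request), then json.load() the response body.
```

Building the request separately from sending it keeps the payload easy to inspect or log before any network call is made.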
curl https://dragonfly-api.com/v1/chat/completions \
  -H "Authorization: Bearer sk-df-xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-chat",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is 2+2?"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1709000000,
  "model": "deepseek/deepseek-chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "2 + 2 = 4"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 18,
    "completion_tokens": 7,
    "total_tokens": 25,
    "cost_usd": 0.000022
  }
}

Streaming
Set stream: true to receive responses as Server-Sent Events (SSE). Each chunk is a JSON object prefixed with data:. The stream ends with data: [DONE].
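
A minimal sketch of consuming such a stream in Python, given lines that have already been read and decoded from the response. The function name `iter_stream_text` is hypothetical, and the chunk shape (`choices[].delta.content`) is an assumption based on the OpenAI streaming format that this endpoint is described as compatible with.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of decoded SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and any non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        # Assumption: chunks follow the OpenAI streaming shape (choices[].delta.content).
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hello"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": " world"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints "Hello world"
```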
curl https://dragonfly-api.com/v1/chat/completions \
  -H "Authorization: Bearer sk-df-xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-chat",
    "stream": true,
    "messages": [{"role": "user", "content": "Write a haiku"}]
  }'

Tool Calling
Pass an array of tool definitions. The model returns tool_calls in the response when it decides to invoke a tool. Compatible with the OpenAI function calling format.
| Parameter | Type | Required | Description |
|---|---|---|---|
| tools | array | No | Array of tool definitions. Each tool has type: "function" and a function object. |
| tool_choice | string or object | No | "auto" (default), "none", or {"type":"function","function":{"name":"..."}} |
| function.name | string | Yes | Function name (no spaces, letters/digits/underscores only) |
| function.description | string | No | Description of when to call this function |
| function.parameters | object | No | JSON Schema object describing the function parameters |
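
The sketch below shows one way to assemble a tool definition from these fields and to decode the tool calls a model returns. The helper names `make_tool` and `decode_tool_calls` are hypothetical, and the `get_weather` schema is invented for illustration. Note that `function.arguments` arrives as a JSON-encoded string, so it needs a second `json.loads`.

```python
import json
import re

# Letters, digits, and underscores only, per the function.name rule in the table.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def make_tool(name, description, parameters):
    """Build one entry for the tools array (hypothetical helper)."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid function name: {name!r}")
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": parameters,  # a JSON Schema object
        },
    }


def decode_tool_calls(message):
    """Return (call id, function name, parsed arguments) for each tool call."""
    calls = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        calls.append((call["id"], function["name"], json.loads(function["arguments"])))
    return calls


weather_tool = make_tool(
    "get_weather",
    "Look up the current weather for a city.",
    {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
)

assistant_message = {
    "role": "assistant",
    "tool_calls": [
        {
            "id": "call_abc123",
            "type": "function",
            "function": {"name": "get_weather", "arguments": "{\"location\": \"Shanghai\"}"},
        }
    ],
}
print(decode_tool_calls(assistant_message))
```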
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"Shanghai\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}

Embeddings
Generate vector embeddings from text. Compatible with the OpenAI embeddings API format.
/embeddings
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Embedding model ID (e.g. text-embedding-3-small) |
| input | string or array | Yes | String or array of strings to embed |
| encoding_format | string | No | "float" (default) or "base64" |
| dimensions | integer | No | Reduce output dimensions (model-dependent) |
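
A small Python sketch of building an embeddings request body and decoding a base64-encoded vector. Both helper names are hypothetical. The decoding step assumes that base64 embeddings are packed as little-endian 32-bit floats, as in OpenAI's implementation; this page does not state the byte layout, so verify it against a real response before relying on it.

```python
import base64
import struct


def build_embeddings_body(model, texts, encoding_format="float", dimensions=None):
    """Build the JSON body for /embeddings (hypothetical helper)."""
    if encoding_format not in ("float", "base64"):
        raise ValueError("encoding_format must be 'float' or 'base64'")
    body = {"model": model, "input": texts, "encoding_format": encoding_format}
    if dimensions is not None:
        body["dimensions"] = dimensions  # support is model-dependent
    return body


def decode_base64_embedding(encoded):
    """Decode a base64 string into a list of floats."""
    # Assumption: the payload is a sequence of little-endian float32 values.
    raw = base64.b64decode(encoded)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


body = build_embeddings_body("text-embedding-3-small", ["hello", "world"], dimensions=256)

# Round-trip check with locally packed data rather than a real API response.
packed = base64.b64encode(struct.pack("<3f", 1.0, -0.5, 0.25)).decode("ascii")
print(decode_base64_embedding(packed))  # prints [1.0, -0.5, 0.25]
```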
Market Routing
Dragonfly operates a market where third-party providers contribute API keys at discounted prices.
| Parameter | Type | Required | Description |
|---|---|---|---|
| Routing-Strategy | header | No | Set to market-only to use marketplace seller capacity only. The request fails with 503 if no eligible seller is available. Requires accepted Trading Terms. |
curl https://dragonfly-api.com/v1/chat/completions \
  -H "Authorization: Bearer sk-df-xxx" \
  -H "Content-Type: application/json" \
  -H "Routing-Strategy: market-only" \
  -d '{
    "model": "deepseek/deepseek-chat",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

| Parameter | Type | Required | Description |
|---|---|---|---|
| X-Max-Price | header | No | Max acceptable input-side price (hard filter). |
| X-Max-TTFT-Ms | header | No | Preferred max TTFT (soft preference). |
| X-Min-Throughput-TPS | header | No | Preferred minimum throughput (soft preference). |
| X-Hard-Max-TTFT-Ms | header | No | Strict max TTFT filter. |
| X-Hard-Min-Throughput-TPS | header | No | Strict minimum throughput filter. |
| X-Privacy-Mode | header | No | Privacy filter: trusted_only. |
| X-Provider-Only | header | No | Comma-separated provider allowlist (hard filter). |
| X-Provider-Ignore | header | No | Comma-separated provider denylist (hard filter). |
| X-Provider-Order | header | No | Comma-separated provider preference order (soft ranking). |
| X-DF-Market-Route | response header | No | Route that served the request: seller. |
| X-DF-Market-Provider | response header | No | Matched provider slug. |
| X-DF-Market-Listing-Id | response header | No | Matched listing ID (seller route). |
| X-DF-Market-Request-Id | response header | No | Matching request ID. |
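
The request headers above can be assembled programmatically. The following Python sketch maps keyword arguments to the documented header names; the function name `routing_headers`, the keyword names, and the provider slugs in the example are all hypothetical. Values are passed through as strings because this page does not specify units or formats beyond what the table states.

```python
HEADER_NAMES = {
    "max_price": "X-Max-Price",
    "max_ttft_ms": "X-Max-TTFT-Ms",
    "min_throughput_tps": "X-Min-Throughput-TPS",
    "hard_max_ttft_ms": "X-Hard-Max-TTFT-Ms",
    "hard_min_throughput_tps": "X-Hard-Min-Throughput-TPS",
    "privacy_mode": "X-Privacy-Mode",
    "provider_only": "X-Provider-Only",
    "provider_ignore": "X-Provider-Ignore",
    "provider_order": "X-Provider-Order",
}


def routing_headers(market_only=False, **preferences):
    """Translate routing preferences into request headers (hypothetical helper)."""
    headers = {}
    if market_only:
        headers["Routing-Strategy"] = "market-only"
    for key, value in preferences.items():
        if key not in HEADER_NAMES:
            raise KeyError(f"unknown routing preference: {key}")
        if isinstance(value, (list, tuple)):
            value = ",".join(value)  # provider lists are comma-separated
        headers[HEADER_NAMES[key]] = str(value)
    return headers


headers = routing_headers(
    market_only=True,
    hard_max_ttft_ms=800,
    provider_only=["provider-a", "provider-b"],  # placeholder slugs
    privacy_mode="trusted_only",
)
print(headers)
```

These headers would be merged with the usual Authorization and Content-Type headers before sending the request.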
List Models
List all available models.
/models
curl https://dragonfly-api.com/v1/models

Audio Speech (TTS)
Generate speech from text. Powered by MiniMax.
/audio/speech
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | minimax/speech-02-hd or minimax/speech-02-turbo |
| input | string | Yes | Text to speak |
| voice | string | No | Voice ID. Default: alloy |
| speed | number | No | Speed (0.25-4.0). Default: 1.0 |
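
For illustration, a short Python sketch that builds a request body for this endpoint and applies the documented constraints before sending. The helper name `build_speech_body` is hypothetical, and the response format (for example, the audio container) is not described on this page, so the sketch stops at the request body.

```python
TTS_MODELS = ("minimax/speech-02-hd", "minimax/speech-02-turbo")


def build_speech_body(model, text, voice="alloy", speed=1.0):
    """Build the JSON body for /audio/speech (hypothetical helper)."""
    if model not in TTS_MODELS:
        raise ValueError(f"unsupported TTS model: {model}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    return {"model": model, "input": text, "voice": voice, "speed": speed}


speech_body = build_speech_body("minimax/speech-02-turbo", "Hello from Dragonfly.", speed=1.25)
print(speech_body)
```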
Models & Pricing
Prices per million tokens.
DeepSeek
| Model ID | Input | Output | Context | Capabilities |
|---|---|---|---|---|
| deepseek/deepseek-chat | $0.50 | $1.50 | 128K | Chat, Code |
| deepseek/deepseek-reasoner | $2.00 | $8.00 | 128K | Reasoning |
MiniMax
| Model ID | Input | Output | Context | Capabilities |
|---|---|---|---|---|
| minimax/minimax-m2.5 | $0.80 | $2.40 | 1M | Chat |
| minimax/minimax-m2.5-highspeed | $0.40 | $1.20 | 1M | Chat (fast) |
| minimax/speech-02-hd | — | — | — | TTS (HD) |
| minimax/speech-02-turbo | — | — | — | TTS (fast) |
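
Because prices are quoted per million tokens, a rough cost estimate is a weighted sum of prompt and completion tokens. The Python sketch below copies the list prices from the tables above; the function name `estimate_cost_usd` is hypothetical, and the calculation assumes simple linear pricing with no rounding or market discounts. The `usage.cost_usd` field in an actual response should be treated as the authoritative figure.

```python
# USD per million tokens as (input, output), copied from the pricing tables.
PRICES = {
    "deepseek/deepseek-chat": (0.50, 1.50),
    "deepseek/deepseek-reasoner": (2.00, 8.00),
    "minimax/minimax-m2.5": (0.80, 2.40),
    "minimax/minimax-m2.5-highspeed": (0.40, 1.20),
}


def estimate_cost_usd(model, prompt_tokens, completion_tokens):
    """Estimate list-price cost in USD for one request (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    return (prompt_tokens * input_price + completion_tokens * output_price) / 1_000_000


# 200,000 prompt tokens at $2.00/M plus 50,000 completion tokens at $8.00/M.
print(estimate_cost_usd("deepseek/deepseek-reasoner", 200_000, 50_000))  # prints 0.8
```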
Rate Limits
Rate limits apply per API key. When exceeded, the response is 429 with type rate_limit_exceeded or insufficient_quota. Retry using the Retry-After header.
| Header | Meaning |
|---|---|
| X-RateLimit-Limit | Request limit for the current window |
| X-RateLimit-Remaining | Requests remaining in current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
| Retry-After | Seconds to wait before retrying (on 429) |
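
One way a client might turn these headers into a wait time after a 429 is sketched below in Python. The function name `retry_delay_seconds` and the one-second fallback are assumptions for this example. It prefers Retry-After and falls back to X-RateLimit-Reset. Header lookup here is case-sensitive on a plain dict; real HTTP client libraries usually provide case-insensitive access.

```python
import time


def retry_delay_seconds(headers, now=None):
    """Return how long to wait before retrying a 429 (hypothetical helper)."""
    if now is None:
        now = time.time()
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return max(0.0, float(retry_after))  # seconds to wait
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        return max(0.0, float(reset) - now)  # Unix timestamp of the window reset
    return 1.0  # arbitrary fallback when neither header is present


print(retry_delay_seconds({"Retry-After": "12"}))  # prints 12.0
print(retry_delay_seconds({"X-RateLimit-Reset": "1709000030"}, now=1709000000))  # prints 30.0
```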
Error Codes
| Code | Type | Description |
|---|---|---|
| 400 | bad_request | Invalid request parameters |
| 401 | unauthorized | Missing or invalid API key |
| 403 | forbidden | Access denied (trading terms not accepted, account suspended, or permission error) |
| 404 | not_found | Model or resource not found |
| 429 | rate_limit_exceeded | Too many requests. Wait for the Retry-After interval, or until X-RateLimit-Reset, before retrying. |
| 429 | insufficient_quota | Insufficient credits. Response includes top_up_url. |
| 500 | internal_error | Server error. Retry with exponential backoff. |
| 502 | upstream_error | Upstream provider error |
| 503 | market_unavailable | No eligible market seller available. Retry or check seller supply. |
Error Response Format
{
"error": {
"message": "Insufficient credits",
"type": "insufficient_quota",
"code": 429,
"top_up_url": "https://dragonfly-api.com/dashboard/billing"
}
}
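
A minimal Python sketch of handling this error format: it parses the body and flags the error types for which the table above suggests retrying. The function name `classify_error` and the choice of which types count as retryable are assumptions for this example. In particular, `upstream_error` is left out only because the table gives no retry guidance for it.

```python
import json

# Types for which the error table explicitly mentions retrying.
RETRYABLE_TYPES = {"rate_limit_exceeded", "internal_error", "market_unavailable"}


def classify_error(body_text):
    """Parse an error response body into a small summary dict (hypothetical helper)."""
    error = json.loads(body_text)["error"]
    return {
        "type": error["type"],
        "message": error["message"],
        "retryable": error["type"] in RETRYABLE_TYPES,
        "top_up_url": error.get("top_up_url"),  # present for insufficient_quota
    }


sample_body = """
{
  "error": {
    "message": "Insufficient credits",
    "type": "insufficient_quota",
    "code": 429,
    "top_up_url": "https://dragonfly-api.com/dashboard/billing"
  }
}
"""
print(classify_error(sample_body))
```

Checking the `type` field rather than the status code alone matters here, since 429 covers both a retryable rate limit and a non-retryable quota shortfall.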