Chat Completions

Create a chat completion. OpenAI-compatible.

POST /chat/completions

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model ID (e.g. deepseek/deepseek-chat) |
| messages | array | Yes | Array of message objects with role and content |
| temperature | number | No | Sampling temperature (0-2). Default: 1 |
| top_p | number | No | Nucleus sampling. Default: 1 |
| max_tokens | integer | No | Maximum tokens to generate |
| stream | boolean | No | Stream response via SSE. Default: false |
| stop | string or array | No | Stop sequences |
| frequency_penalty | number | No | Frequency penalty (-2 to 2). Default: 0 |
| presence_penalty | number | No | Presence penalty (-2 to 2). Default: 0 |
| tools | array | No | Function calling tools (OpenAI format) |
| tool_choice | string or object | No | Tool selection mode |

curl https://dragonfly-api.com/v1/chat/completions \
  -H "Authorization: Bearer sk-df-xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-chat",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is 2+2?"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
Response
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1709000000,
  "model": "deepseek/deepseek-chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "2 + 2 = 4"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 18,
    "completion_tokens": 7,
    "total_tokens": 25,
    "cost_usd": 0.000022
  }
}
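
The same request can be sketched in Python using only the standard library. This is a minimal illustration rather than an official client: the helper names (build_chat_request, extract_reply, chat) are hypothetical, and the base URL and response fields are taken from the curl example and sample response above.

```python
import json
import urllib.request

BASE_URL = "https://dragonfly-api.com/v1"  # base URL taken from the curl example


def build_chat_request(api_key, model, messages, **params):
    """Build (but do not send) a POST request for /chat/completions."""
    body = {"model": model, "messages": messages, **params}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(response_body):
    """Return (text, finish_reason, usage) from a decoded response dict."""
    choice = response_body["choices"][0]
    usage = response_body.get("usage", {})
    return choice["message"]["content"], choice["finish_reason"], usage


def chat(api_key, model, messages, **params):
    """Send the request and return the parsed reply (performs network I/O)."""
    req = build_chat_request(api_key, model, messages, **params)
    with urllib.request.urlopen(req, timeout=60) as resp:
        return extract_reply(json.load(resp))
```

Optional parameters from the table (temperature, max_tokens, and so on) are passed as keyword arguments, for example chat(key, "deepseek/deepseek-chat", messages, temperature=0.7).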

Streaming

Set stream: true to receive responses as Server-Sent Events (SSE). Each chunk is a JSON object prefixed with data:. The stream ends with data: [DONE].

curl https://dragonfly-api.com/v1/chat/completions \
  -H "Authorization: Bearer sk-df-xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-chat",
    "stream": true,
    "messages": [{"role": "user", "content": "Write a haiku"}]
  }'
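
A streamed response can be consumed by reading it line by line and decoding each data: payload. The sketch below shows one way to do that. The function names are hypothetical, and because this page does not show the chunk schema, collect_text assumes OpenAI-style chunks with the text under choices[0].delta.content.

```python
import json


def iter_sse_chunks(lines):
    """Yield decoded JSON chunks from an iterable of SSE text lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream sentinel described above
        yield json.loads(payload)


def collect_text(chunks):
    """Join streamed text. ASSUMPTION: OpenAI-style choices[0].delta.content."""
    parts = []
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            continue
        delta = choices[0].get("delta", {})
        parts.append(delta.get("content") or "")
    return "".join(parts)
```

When reading from an HTTP response object, decode each line to text first (for example line.decode("utf-8")) before passing it to iter_sse_chunks.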

Tool Calling

Pass an array of tool definitions. The model returns tool_calls in the response when it decides to invoke a tool. Compatible with the OpenAI function calling format.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| tools | array | No | Array of tool definitions. Each tool has type: "function" and a function object. |
| tool_choice | string or object | No | "auto" (default), "none", or {"type":"function","function":{"name":"..."}} |
| function.name | string | Yes | Function name (no spaces, letters/digits/underscores only) |
| function.description | string | No | Description of when to call this function |
| function.parameters | object | No | JSON Schema object describing the function parameters |

Response
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"Shanghai\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}
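
To make the round trip concrete, the sketch below defines one tool and dispatches the tool_calls from a response like the one above to local Python functions. The get_weather implementation and the handler registry are hypothetical, and the shape of the returned tool-result messages (role "tool" with tool_call_id) is assumed from the OpenAI format rather than stated on this page.

```python
import json

# Hypothetical tool definition in the format described above.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}


def get_weather(location):
    """Placeholder; a real implementation would query a weather service."""
    return {"location": location, "forecast": "unknown"}


HANDLERS = {"get_weather": get_weather}


def run_tool_calls(response_body, handlers=HANDLERS):
    """Execute each tool call in the first choice and build result messages."""
    message = response_body["choices"][0]["message"]
    results = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        args = json.loads(fn["arguments"])  # arguments arrive as a JSON string
        output = handlers[fn["name"]](**args)
        results.append({
            "role": "tool",  # ASSUMPTION: OpenAI-style tool result message
            "tool_call_id": call["id"],
            "content": json.dumps(output),
        })
    return results
```

Pass [WEATHER_TOOL] as the tools parameter of the request, then append the assistant message and the returned tool messages to messages before calling the endpoint again.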

Embeddings

Generate vector embeddings from text. Compatible with the OpenAI embeddings API format.

POST /embeddings

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Embedding model ID (e.g. text-embedding-3-small) |
| input | string or array | Yes | String or array of strings to embed |
| encoding_format | string | No | "float" (default) or "base64" |
| dimensions | integer | No | Reduce output dimensions (model-dependent) |
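
As a rough illustration, the request body can be assembled as below. The helper names are hypothetical. This page does not describe the response or the base64 layout, so decode_base64_embedding assumes the convention used by OpenAI's API (packed little-endian 32-bit floats); verify this against a real response before relying on it.

```python
import base64
import json
import struct


def build_embeddings_body(model, texts, encoding_format="float", dimensions=None):
    """Serialize a request body for POST /embeddings."""
    if encoding_format not in ("float", "base64"):
        raise ValueError("encoding_format must be 'float' or 'base64'")
    body = {"model": model, "input": texts, "encoding_format": encoding_format}
    if dimensions is not None:
        body["dimensions"] = dimensions  # only honored by some models
    return json.dumps(body)


def decode_base64_embedding(encoded):
    """Decode a base64 vector. ASSUMPTION: little-endian float32 values."""
    raw = base64.b64decode(encoded)
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))
```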

Market Routing

Dragonfly operates a market where third-party providers contribute API keys at discounted prices.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| Routing-Strategy | header | No | Set to market-only to use marketplace seller capacity only. Fails with 503 if no eligible seller is available. Requires accepted Trading Terms. |

curl https://dragonfly-api.com/v1/chat/completions \
  -H "Authorization: Bearer sk-df-xxx" \
  -H "Content-Type: application/json" \
  -H "Routing-Strategy: market-only" \
  -d '{
    "model": "deepseek/deepseek-chat",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| X-Max-Price | header | No | Max acceptable input-side price (hard filter). |
| X-Max-TTFT-Ms | header | No | Preferred max time to first token (TTFT), in milliseconds (soft preference). |
| X-Min-Throughput-TPS | header | No | Preferred minimum throughput, in tokens per second (soft preference). |
| X-Hard-Max-TTFT-Ms | header | No | Strict max TTFT filter. |
| X-Hard-Min-Throughput-TPS | header | No | Strict minimum throughput filter. |
| X-Privacy-Mode | header | No | Privacy filter: trusted_only. |
| X-Provider-Only | header | No | Comma-separated provider allowlist (hard filter). |
| X-Provider-Ignore | header | No | Comma-separated provider denylist (hard filter). |
| X-Provider-Order | header | No | Comma-separated provider preference order (soft ranking). |
| X-DF-Market-Route | response header | No | Route taken. Value: seller. |
| X-DF-Market-Provider | response header | No | Matched provider slug. |
| X-DF-Market-Listing-Id | response header | No | Matched listing ID (seller route). |
| X-DF-Market-Request-Id | response header | No | Matching request ID. |
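
The routing headers can be assembled programmatically, as in the sketch below. The function and argument names are hypothetical, and since this page does not give the unit or format of the header values, they are simply sent as plain strings here.

```python
def build_routing_headers(market_only=False, max_price=None, max_ttft_ms=None,
                          min_throughput_tps=None, hard_limits=False,
                          trusted_only=False, provider_only=None,
                          provider_ignore=None, provider_order=None):
    """Translate routing preferences into the request headers listed above."""
    headers = {}
    if market_only:
        headers["Routing-Strategy"] = "market-only"
    if max_price is not None:
        headers["X-Max-Price"] = str(max_price)  # ASSUMPTION: plain number
    if max_ttft_ms is not None:
        name = "X-Hard-Max-TTFT-Ms" if hard_limits else "X-Max-TTFT-Ms"
        headers[name] = str(max_ttft_ms)
    if min_throughput_tps is not None:
        name = ("X-Hard-Min-Throughput-TPS" if hard_limits
                else "X-Min-Throughput-TPS")
        headers[name] = str(min_throughput_tps)
    if trusted_only:
        headers["X-Privacy-Mode"] = "trusted_only"
    provider_lists = (
        ("X-Provider-Only", provider_only),
        ("X-Provider-Ignore", provider_ignore),
        ("X-Provider-Order", provider_order),
    )
    for name, providers in provider_lists:
        if providers:
            headers[name] = ",".join(providers)
    return headers


def market_match(response_headers):
    """Collect the X-DF-Market-* response headers, case-insensitively."""
    lowered = {k.lower(): v for k, v in response_headers.items()}
    names = {
        "route": "x-df-market-route",
        "provider": "x-df-market-provider",
        "listing_id": "x-df-market-listing-id",
        "request_id": "x-df-market-request-id",
    }
    return {key: lowered.get(name) for key, name in names.items()}
```

Merge the result of build_routing_headers into the Authorization and Content-Type headers of a normal chat completion request. Setting hard_limits=True switches the latency and throughput preferences to their strict X-Hard-* variants.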

List Models

List all available models.

GET /models
curl https://dragonfly-api.com/v1/models

Audio Speech (TTS)

Generate speech from text. Powered by MiniMax.

POST /audio/speech

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | minimax/speech-02-hd or minimax/speech-02-turbo |
| input | string | Yes | Text to speak |
| voice | string | No | Voice ID. Default: alloy |
| speed | number | No | Speed (0.25-4.0). Default: 1.0 |
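
A small client-side check can catch invalid values before the request is sent. The sketch below builds the JSON body and validates the model and speed against the table above. The function name is hypothetical, and the response format (presumably audio bytes) is not documented here, so it is not handled.

```python
import json

TTS_MODELS = ("minimax/speech-02-hd", "minimax/speech-02-turbo")


def build_speech_body(model, text, voice="alloy", speed=1.0):
    """Serialize a request body for POST /audio/speech."""
    if model not in TTS_MODELS:
        raise ValueError(f"unsupported TTS model: {model}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    return json.dumps(
        {"model": model, "input": text, "voice": voice, "speed": speed}
    )
```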

Models & Pricing

Prices are in USD per million tokens.

DeepSeek

| Model ID | Input | Output | Context | Capabilities |
| --- | --- | --- | --- | --- |
| deepseek/deepseek-chat | $0.50 | $1.50 | 128K | Chat, Code |
| deepseek/deepseek-reasoner | $2.00 | $8.00 | 128K | Reasoning |

MiniMax

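| Model ID | Input | Output | Context | Capabilities |
| --- | --- | --- | --- | --- |
| minimax/minimax-m2.5 | $0.80 | $2.40 | 1M | Chat |
| minimax/minimax-m2.5-highspeed | $0.40 | $1.20 | 1M | Chat (fast) |
| minimax/speech-02-hd | | | | TTS (HD) |
| minimax/speech-02-turbo | | | | TTS (fast) |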
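
Because prices are quoted per million tokens, a list-price estimate for one request is prompt_tokens × input price plus completion_tokens × output price, divided by 1,000,000. The sketch below applies that formula to the chat models listed above. The function name is hypothetical, and the result is only an estimate: the cost_usd field returned in usage is the figure to rely on and may differ from it.

```python
from decimal import Decimal

# (input, output) list prices in USD per million tokens, from the tables above.
PRICES_PER_MILLION = {
    "deepseek/deepseek-chat": (Decimal("0.50"), Decimal("1.50")),
    "deepseek/deepseek-reasoner": (Decimal("2.00"), Decimal("8.00")),
    "minimax/minimax-m2.5": (Decimal("0.80"), Decimal("2.40")),
    "minimax/minimax-m2.5-highspeed": (Decimal("0.40"), Decimal("1.20")),
}


def estimate_cost_usd(model, prompt_tokens, completion_tokens):
    """Estimate the list-price cost of one request in USD."""
    input_price, output_price = PRICES_PER_MILLION[model]
    total = prompt_tokens * input_price + completion_tokens * output_price
    return total / Decimal(1_000_000)
```

For 18 prompt tokens and 7 completion tokens on deepseek/deepseek-chat, this works out to (18 × 0.50 + 7 × 1.50) / 1,000,000 = $0.0000195.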

Rate Limits

Rate limits apply per API key. When a limit is exceeded, the API returns HTTP 429 with an error type of rate_limit_exceeded or insufficient_quota. Wait the number of seconds given in the Retry-After header before retrying.

| Header | Meaning |
| --- | --- |
| X-RateLimit-Limit | Request limit for the current window |
| X-RateLimit-Remaining | Requests remaining in current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
| Retry-After | Seconds to wait before retrying (on 429) |
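
One way to turn these headers into a wait time is sketched below: prefer Retry-After, fall back to X-RateLimit-Reset, and otherwise use a default. The function name and the 1-second default are assumptions made for illustration.

```python
import time


def seconds_to_wait(headers, now=None, default=1.0):
    """Return how long to sleep after a 429, based on the response headers."""
    lowered = {k.lower(): v for k, v in headers.items()}

    retry_after = lowered.get("retry-after")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # unparseable value: fall through to the reset timestamp

    reset = lowered.get("x-ratelimit-reset")
    if reset is not None:
        try:
            current = time.time() if now is None else now
            return max(0.0, float(reset) - current)
        except ValueError:
            pass

    return default  # ASSUMPTION: arbitrary fallback when no header is usable
```

A caller would typically run time.sleep(seconds_to_wait(response_headers)) before retrying the request.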

Error Codes

| Code | Type | Description |
| --- | --- | --- |
| 400 | bad_request | Invalid request parameters |
| 401 | unauthorized | Missing or invalid API key |
| 403 | forbidden | Access denied (trading terms not accepted, account suspended, or permission error) |
| 404 | not_found | Model or resource not found |
| 429 | rate_limit_exceeded | Too many requests. Retry after X-RateLimit-Reset. |
| 429 | insufficient_quota | Insufficient credits. Response includes top_up_url. |
| 500 | internal_error | Server error. Retry with exponential backoff. |
| 502 | upstream_error | Upstream provider error |
| 503 | market_unavailable | No eligible market seller available. Retry or check seller supply. |

Error Response Format

Response
{
  "error": {
    "message": "Insufficient credits",
    "type": "insufficient_quota",
    "code": 429,
    "top_up_url": "https://dragonfly-api.com/dashboard/billing"
  }
}
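
A client can parse this envelope into an exception and decide whether a retry is worthwhile. The sketch below is one possible approach, with hypothetical names throughout. It treats only the error types that the table above describes as retryable (rate_limit_exceeded, internal_error, market_unavailable) as such, and the backoff base and cap are illustrative values.

```python
import json
import random

# Types the error table explicitly suggests retrying.
RETRYABLE_TYPES = {"rate_limit_exceeded", "internal_error", "market_unavailable"}


class ApiError(Exception):
    """Error parsed from the JSON error envelope shown above."""

    def __init__(self, status, error):
        super().__init__(error.get("message", "unknown error"))
        self.status = status
        self.type = error.get("type")
        self.top_up_url = error.get("top_up_url")

    @property
    def retryable(self):
        return self.type in RETRYABLE_TYPES


def parse_error(status, body_text):
    """Build an ApiError from an HTTP status and raw response body."""
    try:
        error = json.loads(body_text).get("error", {})
    except (ValueError, AttributeError):
        error = {"message": body_text}  # body was not the expected JSON object
    if not isinstance(error, dict):
        error = {"message": str(error)}
    return ApiError(status, error)


def backoff_delay(attempt, base=0.5, cap=30.0):
    """Exponential backoff with full jitter; base and cap are illustrative."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

For an insufficient_quota error, retryable is False and top_up_url points to the billing page, so the caller can surface that link instead of retrying.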