Priority tier and Turbo mode are in Preview. Features, pricing, and availability may change. We welcome your feedback!
Fireworks offers a Priority tier for workloads that require higher reliability, as well as a Turbo mode for workloads that require higher speeds.

Priority tier

Priority tier is for workloads that require higher reliability during peak traffic periods, at a higher price point. Priority tier requests are prioritized above Standard traffic and are less likely to be rate limited. To use Priority tier, set service_tier to "priority" in the request body (OpenAI-compatible chat completions only):
curl https://api.fireworks.ai/inference/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $FIREWORKS_API_KEY" \
  -d '{
    "model": "accounts/fireworks/models/kimi-k2p5",
    "service_tier": "priority",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
Priority tier is available on select models. Models and pricing are listed on the Pricing page.
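The same request can be built from Python. This is a minimal sketch using only the standard library: the helper names build_chat_request and send are illustrative and not part of any Fireworks SDK, and it assumes that leaving service_tier out of the body keeps the request on the default tier.

```python
import json
import urllib.request

API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"


def build_chat_request(model, prompt, api_key, service_tier=None):
    """Build (but do not send) a chat completions request.

    Hypothetical helper for illustration. service_tier is only added to
    the body when given, e.g. service_tier="priority".
    """
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if service_tier is not None:
        body["service_tier"] = service_tier
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

To reproduce the curl call, pass "accounts/fireworks/models/kimi-k2p5", "Hello", the value of the FIREWORKS_API_KEY environment variable, and service_tier="priority" to build_chat_request, then hand the result to send.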

Turbo mode

Turbo mode is a high-speed configuration, useful for interactive applications that require fast responses, at a higher price point. It is not a different model, and the quality of the model remains the same. Turbo mode is available for select models. To use Turbo mode, change the model ID to the one listed below.
| Model | Model ID |
| --- | --- |
| Kimi K2.6 Turbo | accounts/fireworks/routers/kimi-k2p6-turbo |
curl https://api.fireworks.ai/inference/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $FIREWORKS_API_KEY" \
  -d '{
    "model": "accounts/fireworks/routers/kimi-k2p6-turbo",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
Pricing is listed on the Pricing page.
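Because Turbo mode is selected through the model ID alone, switching an existing request over amounts to replacing one field. A minimal sketch, assuming the ID from the table in this section; the helper name use_turbo is hypothetical:

```python
# Turbo model ID as listed in the table in this section.
TURBO_MODEL_ID = "accounts/fireworks/routers/kimi-k2p6-turbo"


def use_turbo(body, turbo_model_id=TURBO_MODEL_ID):
    """Return a copy of a chat completions body with only "model" replaced.

    Hypothetical helper for illustration; the input dict is not modified,
    and every other field (messages, sampling options) is carried over.
    """
    return {**body, "model": turbo_model_id}
```

Everything else in the request (endpoint, headers, messages) stays the same as in a non-Turbo call.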