Fireworks Serverless offers three serving paths:
  • Standard is the default serving path. No service_tier parameter is needed.
  • Priority tier is for workloads that require higher reliability during peak traffic. It is selected with the service_tier parameter.
  • Fast is for workloads that require higher generation speed. It is selected through the model ID.

Priority tier

Priority tier is for workloads that require higher reliability during peak traffic periods, at a higher price point. Priority requests are served ahead of Standard traffic and are less likely to be load shed (rejected with a 503 "server overloaded" error). To use Priority tier, set service_tier to "priority". The parameter is supported on the OpenAI-compatible chat completions API and on the Anthropic-compatible messages API. For example, with chat completions:
curl https://api.fireworks.ai/inference/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $FIREWORKS_API_KEY" \
  -d '{
    "model": "accounts/fireworks/models/kimi-k2p5",
    "service_tier": "priority",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
Priority tier is available on select models. Models and pricing are listed on the Serverless pricing page.
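
The curl request above can be mirrored from Python. What follows is a minimal sketch using only the standard library, not an official client: build_chat_request and send are hypothetical helper names, and the sketch assumes the API key is stored in the FIREWORKS_API_KEY environment variable, as in the curl example. It leaves service_tier out for Standard traffic and sets it to "priority" for Priority tier.

```python
import json
import os
import urllib.request

API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"


def build_chat_request(model, messages, priority=False, api_key=None):
    """Build (but do not send) a chat completions request.

    Hypothetical helper for illustration only. Standard traffic omits
    service_tier; Priority tier sets it to "priority".
    """
    payload = {"model": model, "messages": messages}
    if priority:
        payload["service_tier"] = "priority"

    # Assumption: the key lives in FIREWORKS_API_KEY unless passed explicitly.
    key = api_key if api_key is not None else os.environ.get("FIREWORKS_API_KEY", "")

    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {key}",
        },
        method="POST",
    )


def send(request):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Build a Priority tier request; call send(request) to actually issue it.
request = build_chat_request(
    "accounts/fireworks/models/kimi-k2p5",
    [{"role": "user", "content": "Hello"}],
    priority=True,
)
```

Keeping request construction separate from sending makes it easy to inspect the payload, or to switch a single call between Standard and Priority, without touching the network code.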

Fast

Fast is a high-speed serving path for interactive applications that need fast responses, at a higher price point. Fast variants aim to generate 100+ tokens per second. A Fast variant is not a different model, and the quality of the model remains the same. Fast is available for select models. To use Fast, set the model field to the corresponding model ID listed below.
Model            Model ID
Kimi K2.6 Fast   accounts/fireworks/routers/kimi-k2p6-fast
GLM 5.1 Fast     accounts/fireworks/routers/glm-5p1-fast
curl https://api.fireworks.ai/inference/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $FIREWORKS_API_KEY" \
  -d '{
    "model": "accounts/fireworks/routers/kimi-k2p6-fast",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
Pricing is listed on the Serverless pricing page.
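
Because Fast is selected through the model ID alone, switching to it only changes the model field of the request. The sketch below illustrates that and adds a rough way to compare observed speed against the 100+ tokens per second target. FAST_MODEL_IDS, fast_payload, and tokens_per_second are hypothetical names introduced for illustration. The sketch assumes the token count comes from the usage.completion_tokens field of an OpenAI-compatible response and that elapsed time is measured by the caller. Wall-clock time includes network latency and prompt processing, so the result understates pure generation throughput.

```python
# Model IDs copied from the table above.
FAST_MODEL_IDS = {
    "Kimi K2.6 Fast": "accounts/fireworks/routers/kimi-k2p6-fast",
    "GLM 5.1 Fast": "accounts/fireworks/routers/glm-5p1-fast",
}


def fast_payload(model_name, messages):
    """Build a chat completions payload that targets a Fast variant.

    Hypothetical helper: Fast is chosen by model ID, so no
    service_tier field is added.
    """
    return {"model": FAST_MODEL_IDS[model_name], "messages": messages}


def tokens_per_second(completion_tokens, elapsed_seconds):
    """Rough end-to-end generation rate for a single response."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completion_tokens / elapsed_seconds


payload = fast_payload("Kimi K2.6 Fast", [{"role": "user", "content": "Hello"}])
```

For a tighter estimate, time only the span between the first and last streamed token rather than the whole request.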