# Serverless Priority and Fast

> Serverless Priority Tier and Fast Mode on Fireworks

<Tip>
  Priority tier and Fast mode are in Preview. Features, pricing, and availability may change, and we welcome your feedback!
</Tip>

Fireworks offers a Priority tier for workloads that require higher reliability, as well as a Fast mode for workloads that require higher speeds.

## Priority tier

Priority tier is for workloads that need higher reliability during peak traffic periods, at a higher price point. Priority requests are prioritized above Standard traffic and are less likely to be load shed, that is, rejected with a `503` server overloaded response.

To use priority tier, set `service_tier` to `"priority"` in the request body. The parameter is supported on OpenAI-compatible chat completions and on the [Anthropic-compatible](/tools-sdks/anthropic-compatibility) `messages` API:

```bash theme={null}
curl https://api.fireworks.ai/inference/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $FIREWORKS_API_KEY" \
  -d '{
    "model": "accounts/fireworks/models/kimi-k2p5",
    "service_tier": "priority",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

Priority tier is available on select models. Models and pricing are listed on the [Serverless pricing](/serverless/pricing) page.
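When only some requests should go through priority tier, it can help to build the request body in one place and add `service_tier` on demand. The following is a minimal sketch, not an official client: `build_body` is a hypothetical helper name, and it assumes the model id and prompt contain no characters that need JSON escaping.

```bash
# Hypothetical helper: build a chat completions request body and add
# "service_tier" only when a tier is passed as the third argument.
# Arguments are inserted verbatim, so they must not contain characters
# that need JSON escaping (double quotes, backslashes, newlines).
build_body() {
  local model="$1" prompt="$2" tier="${3:-}"
  local tier_field=""
  if [ -n "$tier" ]; then
    tier_field="\"service_tier\": \"$tier\", "
  fi
  printf '{"model": "%s", %s"messages": [{"role": "user", "content": "%s"}]}\n' \
    "$model" "$tier_field" "$prompt"
}

# Example call (needs FIREWORKS_API_KEY and network access):
# curl https://api.fireworks.ai/inference/v1/chat/completions \
#   -H "Content-Type: application/json" \
#   -H "Authorization: Bearer $FIREWORKS_API_KEY" \
#   -d "$(build_body accounts/fireworks/models/kimi-k2p5 Hello priority)"
```

Omitting the third argument leaves `service_tier` out of the body entirely, so the same helper covers both the priority case and a request with no tier set.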

## Fast mode

Fast mode is a high-speed configuration for interactive applications that need fast responses, at a higher price point. Fast variants aim for **100+ tokens per second** of generation throughput. A Fast variant is not a different model, and output quality remains the same.

Fast mode is available for select models. To use Fast mode, set the `model` id to the value listed below.

| Model           | `model` id                                   |
| --------------- | -------------------------------------------- |
| Kimi K2.6 Turbo | `accounts/fireworks/routers/kimi-k2p6-turbo` |
| GLM 5.1 Fast    | `accounts/fireworks/routers/glm-5p1-fast`    |

```bash theme={null}
curl https://api.fireworks.ai/inference/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $FIREWORKS_API_KEY" \
  -d '{
    "model": "accounts/fireworks/routers/kimi-k2p6-turbo",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

Pricing is listed on the [Serverless pricing](/serverless/pricing) page.
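The 100+ tokens-per-second target can be sanity-checked with simple arithmetic: generated tokens divided by elapsed seconds. The sketch below assumes the token count is read from an OpenAI-style `usage.completion_tokens` field in the response and the elapsed time from curl's `-w '%{time_total}'` output. `tokens_per_second` is a hypothetical helper, not part of any Fireworks tooling.

```bash
# Hypothetical helper: generated tokens per second, given a completion
# token count and an elapsed time in seconds.
tokens_per_second() {
  awk -v tokens="$1" -v seconds="$2" 'BEGIN {
    # Guard against division by zero or a negative duration.
    if (seconds <= 0) { print "n/a"; exit }
    printf "%.1f\n", tokens / seconds
  }'
}

# 500 generated tokens in 4 seconds
tokens_per_second 500 4   # prints 125.0
```

Total request time also includes queueing and time to first token, so this figure understates pure generation speed. A streaming measurement that starts at the first token would be closer to the quoted throughput.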

## Related

* [Serverless overview](/serverless/overview)
* [Serverless quickstart](/getting-started/quickstart)
* [Text models](/guides/querying-text-models)
* [Anthropic compatibility](/tools-sdks/anthropic-compatibility): `service_tier` is supported on both OpenAI-compatible chat completions and the Anthropic `messages` API.
