> ## Documentation Index
> Fetch the complete documentation index at: https://docs.fireworks.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# What factors affect the number of simultaneous requests that can be handled?

Request handling capacity depends on several factors:

* **Model size and type**
* **Number of GPUs allocated** to the deployment
* **GPU type** (e.g., A100, H100)
* **Prompt size**
* **Generation token length**
* **Deployment type** (serverless vs. on-demand)
