Quotas are usage limits placed on your account. Most quotas are not configurable.

Default quotas

By default, the following quotas are in place:

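| Quota name | Default value | Can be raised? |
| --- | --- | --- |
| Serverless inference RPM | 600 | No |
| Number of deployed models | 100 | Yes |
| Number of A100 GPUs | 8 | Yes |
| Number of H100 GPUs | 8 | Yes |
| Monthly spend (USD) | $50 | Yes |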
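
Because the serverless inference RPM quota cannot be raised, a client may want to throttle itself before sending requests. The following is a minimal sketch of a client-side sliding-window limiter. `MinuteRateLimiter` and `try_acquire` are hypothetical names, not part of any Fireworks SDK, and the sketch assumes the quota is counted over a rolling 60-second window, which this page does not specify.

```python
import time
from collections import deque


class MinuteRateLimiter:
    """Client-side sliding-window limiter (hypothetical helper, not a Fireworks API)."""

    def __init__(self, max_per_minute=600, clock=time.monotonic):
        self.max_per_minute = max_per_minute
        self.clock = clock      # injectable so the limiter can be tested without sleeping
        self.sent = deque()     # timestamps of requests counted in the current window

    def try_acquire(self):
        """Return True and record the request if it fits in the current window."""
        now = self.clock()
        # Drop timestamps that are 60 seconds old or older.
        while self.sent and now - self.sent[0] >= 60.0:
            self.sent.popleft()
        if len(self.sent) < self.max_per_minute:
            self.sent.append(now)
            return True
        return False
```

Call `try_acquire()` before each request and wait or queue the request when it returns `False`.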

Viewing quotas

You can view your current quota capacity by running:

firectl list quotas

Raising quotas

Number of deployed models

Raising the deployed-model quota is available for enterprise accounts. Contact the Fireworks team at inquiries@fireworks.ai.

GPU quotas

GPU quotas limit the number of on-demand GPUs you can run. To raise your GPU quotas, you must purchase a reservation. Contact the Fireworks team at inquiries@fireworks.ai to purchase a reservation or to learn more.

Monthly spend

To prevent fraud, Fireworks imposes a monthly spending limit on your account. Once you hit the spending limit, your account automatically enters a suspended state and all Fireworks usage stops. This includes serverless inference, dedicated deployments, and fine-tuning jobs.

Your spending limit will organically increase over time as you spend more on the platform. See the following table:

| Tier | Spending limit | Qualification |
| --- | --- | --- |
| Tier 1 | $50/mo | Valid payment method added |
| Tier 2 | $500/mo | Total historical spend of $100+ |
| Tier 3 | $5,000/mo | Total historical spend of $1,000+ |
| Tier 4 | $50,000/mo | Total historical spend of $10,000+ |
| Unlimited | Unlimited | Contact us at inquiries@fireworks.ai |
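
The tier table can be read as a simple threshold lookup. The sketch below is illustrative only: `TIERS` and `spending_tier` are hypothetical names, and returning `None` for an account without a payment method is an assumption, since this page does not say what limit applies in that case. The Unlimited tier is left out because it is granted on request rather than by spend.

```python
# (minimum total historical spend in USD, tier name, monthly limit in USD),
# ordered from highest threshold to lowest.
TIERS = [
    (10_000, "Tier 4", 50_000),
    (1_000, "Tier 3", 5_000),
    (100, "Tier 2", 500),
    (0, "Tier 1", 50),
]


def spending_tier(historical_spend_usd, has_payment_method=True):
    """Return (tier name, monthly limit in USD) for a given total historical spend."""
    if historical_spend_usd < 0:
        raise ValueError("historical spend cannot be negative")
    if not has_payment_method:
        return None  # assumption: no tier applies until a valid payment method is added
    for min_spend, name, limit in TIERS:
        if historical_spend_usd >= min_spend:
            return name, limit
```

For example, `spending_tier(250)` returns `("Tier 2", 500)`, because $250 meets the $100 threshold but not the $1,000 one.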

You can purchase prepaid credits in order to move into the next tier.

Credits are counted against your spending limit, so it is possible to hit the spending limit before all of your current credits are depleted.
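
A short worked example may make this clearer. The sketch below assumes that all usage in the month is paid from credits and that the limit does not change during the month. The function name is hypothetical.

```python
def unused_credits_at_limit(monthly_limit_usd, credit_balance_usd):
    """Credits left over once usage paid from credits reaches the monthly limit.

    Assumes every dollar of usage this month is drawn from the credit balance.
    """
    return max(credit_balance_usd - monthly_limit_usd, 0)


# A $50/mo limit with an $80 credit balance: the account reaches the limit
# after $50 of usage, with $30 of credits still unspent.
leftover = unused_credits_at_limit(50, 80)
```

Under these assumptions, `leftover` is 30: the account is suspended for the rest of the month even though credits remain.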