Service levels

Latency guarantees

Q: Is latency guaranteed for serverless models?

No. Serverless models currently have no latency or availability guarantees. Guarantees are coming soon; in the meantime, we recommend contacting sales to discuss any specific needs or requirements you have.

Service level agreements

Q: Are there any SLAs for serverless models?

Our multi-tenant serverless offering does not currently come with Service Level Agreements (SLAs). SLAs are coming, and we’d love to understand your use case so that you have the best possible experience on the Fireworks platform. Reach out to us via sales or our Discord community.

Quota information

Q: Are there any quotas for serverless?

Yes. For serverless deployments, request-rate limits and tokens-per-minute (TPM) limits work together. Standard serverless, Priority tier, and Fast all follow the same public serverless rate-limit policy; your effective limits depend on payment method, traffic patterns, and spend tier. See Serverless rate limits for the full TPM policy.

For higher quotas:

Review Serverless rate limits
Consider switching to on-demand deployments
Contact sales for higher starting limits, higher TPM upper bounds, or higher request-rate needs
Evaluate dedicated infrastructure options for greater flexibility
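
To illustrate how a request-rate limit and a TPM (tokens-per-minute) limit can work together, here is a minimal client-side sketch in Python. It is not how Fireworks enforces limits server-side, which this page does not describe. The class name `DualRateLimiter`, its parameters, and the limit values you pass in are all hypothetical; take your actual limits from Serverless rate limits. The idea is simply that a request is admitted only when it fits under both budgets within a sliding one-minute window.

```python
import time
from collections import deque


class DualRateLimiter:
    """Hypothetical client-side limiter (not a Fireworks API).

    A request is admitted only if it fits under BOTH the
    requests-per-minute budget and the tokens-per-minute budget.
    """

    def __init__(self, max_rpm, max_tpm, window_seconds=60.0, clock=time.monotonic):
        self.max_rpm = max_rpm          # assumed request-rate limit
        self.max_tpm = max_tpm          # assumed token limit per window
        self.window = window_seconds
        self.clock = clock              # injectable so the limiter can be tested
        self.events = deque()           # (timestamp, tokens) of admitted requests
        self.tokens_in_window = 0

    def _evict_expired(self, now):
        # Drop admitted requests that have aged out of the sliding window.
        while self.events and now - self.events[0][0] >= self.window:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def try_acquire(self, tokens):
        """Return True and record the request if both budgets allow it."""
        now = self.clock()
        self._evict_expired(now)
        if len(self.events) + 1 > self.max_rpm:
            return False  # request-rate budget exhausted
        if self.tokens_in_window + tokens > self.max_tpm:
            return False  # token budget exhausted
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
        return True
```

A caller would estimate the token count of a request, call `try_acquire(estimated_tokens)`, and wait or queue the request when it returns `False`. Either budget can be the binding one: many small requests tend to hit the request-rate limit first, while a few large requests tend to hit the TPM limit first.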

Additional resources

Discord Community: discord.gg/fireworks-ai
Email Support: inquiries@fireworks.ai
Documentation: Fireworks.ai docs
