Latency guarantees

Q: Is latency guaranteed for serverless models?

There are currently no latency or availability guarantees for serverless models. They are coming soon, and in the meantime we recommend contacting sales to discuss any specific requirements you have.


Service level agreements

Q: Are there any SLAs for serverless models?

Our multi-tenant serverless offering does not currently come with Service Level Agreements (SLAs). However, they are coming, and we’d love to understand your use case so that you have the best possible experience on the Fireworks platform. Reach out to us via sales or our Discord community.


Quota information

Q: Are there any quotas for serverless?

For serverless deployments, quotas are as follows:

  • Developer accounts: 600 requests per minute (RPM)
  • Enterprise accounts: 600 requests per minute (RPM)
  • Quotas apply across all models and cannot be exceeded within the serverless infrastructure
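
To stay under the 600 RPM quota listed above, a client can throttle itself before sending requests. The following is a minimal sketch of a sliding-window limiter, not an official client. The class name RpmLimiter and its parameters are hypothetical, and it assumes the quota is counted over a rolling 60-second window, which this document does not confirm.

```python
import time
from collections import deque


class RpmLimiter:
    """Client-side sliding-window limiter (hypothetical helper).

    Allows at most `limit` calls per `window` seconds. The 60-second
    rolling window is an assumption about how the quota is measured.
    """

    def __init__(self, limit=600, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.window = window
        self._clock = clock    # injectable so the limiter can be tested
        self._sleep = sleep
        self._sent = deque()   # timestamps of calls still inside the window

    def _evict(self, now):
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if none)."""
        now = self._clock()
        self._evict(now)
        if len(self._sent) < self.limit:
            return 0.0
        return self.window - (now - self._sent[0])

    def acquire(self):
        """Block until a call is allowed, then record it."""
        delay = self.wait_time()
        while delay > 0:
            self._sleep(delay)
            delay = self.wait_time()
        self._sent.append(self._clock())
```

Calling limiter.acquire() immediately before each API request keeps a single process under the limit. Because the quota applies across all models, one limiter instance would be shared by every request the account sends. Multiple processes or machines would need a shared counter instead, and the client should still handle whatever error the server returns when the quota is exceeded.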

For higher quotas:

  • Consider switching to on-demand deployments
  • Contact enterprise sales for custom solutions
  • Evaluate dedicated infrastructure options for greater flexibility

Additional resources