For serverless deployments, quotas are as follows:

  • Developer accounts: 600 requests per minute (RPM)
  • Enterprise accounts: 600 requests per minute (RPM)
  • Quotas apply across all models and cannot be raised on serverless infrastructure
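
One way to stay under the 600 RPM quota is to throttle requests on the client side. The following is a minimal sketch of a sliding-window limiter, not an official client feature: the class name `RequestsPerMinuteLimiter` and its parameters are hypothetical, and it assumes the quota is counted over a rolling 60-second window, which the text above does not specify.

```python
import time
from collections import deque


class RequestsPerMinuteLimiter:
    """Client-side sliding-window limiter (hypothetical helper, not part of any SDK)."""

    def __init__(self, max_requests=600, window_seconds=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        # 600 matches the serverless quota above; the 60 s rolling window is an assumption.
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of requests inside the current window

    def _evict(self, now):
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()

    def wait_time(self):
        """Return the seconds to wait before another request fits in the window."""
        now = self._clock()
        self._evict(now)
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def acquire(self):
        """Block until a request slot is free, then record the request."""
        while True:
            delay = self.wait_time()
            if delay <= 0:
                break
            self._sleep(delay)
        self._sent.append(self._clock())
```

Call `limiter.acquire()` immediately before each request. Because the service may count requests differently from this sketch, a client should still handle rate-limit responses (commonly HTTP 429) with a retry and backoff.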

For higher quotas:

  • Consider switching to on-demand deployments
  • Contact enterprise sales for custom solutions
  • Evaluate dedicated infrastructure options for greater flexibility