Serverless deployments are subject to the following rate limits:
  • Self-serve accounts: 6,000 requests per minute (RPM)
  • The quota counts requests to all models combined and is a hard cap on serverless infrastructure
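
Because the limit counts requests to all models together, a client can throttle itself before sending rather than wait for requests to be rejected. The following is a minimal sketch of a client-side sliding-window limiter, not part of any official SDK. The class name SlidingWindowLimiter and its parameters are hypothetical, and it assumes the service counts requests over a rolling 60-second window, which the text above does not specify.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any rolling `window` seconds.

    Assumptions (not from the documentation): the quota is measured over
    a rolling window, and `limit` is at least 1.
    """

    def __init__(self, limit=6000, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.window = window
        self.clock = clock    # injectable so the limiter can be tested
        self.sleep = sleep
        self.sent = deque()   # timestamps of calls still inside the window

    def _evict(self, now):
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()

    def try_acquire(self):
        """Record a call and return True if the budget allows it."""
        now = self.clock()
        self._evict(now)
        if len(self.sent) < self.limit:
            self.sent.append(now)
            return True
        return False

    def acquire(self):
        """Block until a call is allowed, then record it."""
        while not self.try_acquire():
            # Sleep until the oldest recorded call leaves the window.
            wait = self.window - (self.clock() - self.sent[0])
            self.sleep(max(wait, 0.0))
```

One way to use it is to share a single limiter instance across every model the account calls and invoke limiter.acquire() immediately before each request. The sketch is not thread-safe; a multi-threaded client would need a lock around try_acquire.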
For higher quotas:
  • Consider switching to on-demand deployments
  • Contact enterprise sales for custom solutions
  • Evaluate dedicated infrastructure options for greater flexibility