Skip to main content
Fireworks AI is the fastest platform for building with open source AI models. Get production-ready inference and fine-tuning with best-in-class speed, cost and quality.

Get started in minutes

Start fast with Serverless

Use popular models instantly with pay-per-token pricing. Perfect for quality vibe testing and prototyping.

Deploy models & autoscale on dedicated GPUs

Deploy with high performance on dedicated GPUs with fast autoscaling and minimal cold starts. Optimize deployments for speed and throughput.

Fine-tune models for best quality

Boost model quality with supervised and reinforcement fine-tuning of models up to 1T+ parameters. Start training in minutes, deploy immediately.
Not sure where to start? First, pick the right model for your use case with our model selection guide. Then choose Serverless to prototype quickly, move to Deployments to optimize and run production workloads, or use Fine-tuning to improve quality.Need help optimizing deployments, fine-tuning models, or setting up production infrastructure? Talk to our team - we’ll help you get the best performance and reliability.

What you can build

100+ Supported Models

Text, vision, audio, image, and embeddings

Migrate from OpenAI

Drop-in replacement - just change the base URL

Function Calling

Connect models to tools and APIs

Structured Outputs

Reliable JSON responses for agentic workflows

Vision Models

Analyze images and documents

Speech to Text

Real-time or batch audio transcription

Embeddings & Reranking

Use embeddings & reranking in search & context retrieval

Batch Inference

Run async inference jobs at scale, faster and cheaper

Resources & help

Which model should I use?

Find the best model for your use case

Cookbook

Code examples and tutorials

API Reference

Complete API documentation

Discord Community

Ask questions and get help from developers

Security & Compliance

SOC 2, HIPAA, and audit reports

System Status

Check service uptime

Talk to Sales

Talk to our team