Choose by Use Case
| Category | Use Case | Recommended Models |
|---|---|---|
| Code & Development | Code generation & reasoning | Kimi K2 0905, Deepseek V3.1, GLM 4.6 (Large); Qwen2.5-32B-Coder (Medium); Qwen3 Coder 30B A3B (Small) |
| Code & Development | Code completion & bug fixing | Qwen3 235B A22B, Qwen2.5-32B-Coder (Medium); Qwen3 Coder 30B A3B, Qwen3 14B, Qwen3 8B (Small) |
| AI Applications | AI Agents with tool use | Kimi K2 0905, Deepseek V3.1, Qwen3 235B A22B, GLM 4.6 (Large); Qwen 3 Family Models (Large/Medium/Small) |
| AI Applications | General reasoning & planning | Kimi K2 0905, Kimi K2 Thinking, Deepseek V3.1, Qwen3 235B Thinking 2507, GLM 4.6 (Large); GPT-OSS-120B, Qwen2.5-72B-Instruct, Llama 3.3 70B (Medium) |
| AI Applications | Long context & summarization | Kimi K2 0905 (Large); GPT-OSS-120B (Medium) |
| AI Applications | Fast semantic search & extraction | GPT-OSS-120B (Medium); GPT-OSS 20B, Qwen3 8B, Qwen 3 4B, Llama 3.1 8B, Llama 3.2 3B, Llama 3.2 1B (Small) |
| Vision & Multimodal | Vision & document understanding | Qwen3 VL 235B A22B, Qwen2.5-VL 72B Instruct, Qwen2.5-VL 32B Instruct (Medium); Deepseek OCR, Qwen3 VL 30B A3B, Qwen2.5-VL 3-7B (Small) |
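One way to use the table programmatically is to pick the largest listed size tier that fits a deployment constraint. The sketch below is illustrative only: it copies three rows whose tier labels are unambiguous, and the `RECOMMENDED` layout, the `recommend` function, and the fall-back-to-a-smaller-tier rule are assumptions made for this example, not part of any Fireworks API. The strings are the display names from the table, not API model identifiers.

```python
# Size tiers ordered from largest to smallest, as labeled in the table.
TIERS = ["Large", "Medium", "Small"]

# Three rows copied from the table above; the dictionary layout is hypothetical.
RECOMMENDED = {
    "Code generation & reasoning": {
        "Large": ["Kimi K2 0905", "Deepseek V3.1", "GLM 4.6"],
        "Medium": ["Qwen2.5-32B-Coder"],
        "Small": ["Qwen3 Coder 30B A3B"],
    },
    "Long context & summarization": {
        "Large": ["Kimi K2 0905"],
        "Medium": ["GPT-OSS-120B"],
    },
    "Fast semantic search & extraction": {
        "Medium": ["GPT-OSS-120B"],
        "Small": [
            "GPT-OSS 20B", "Qwen3 8B", "Qwen 3 4B",
            "Llama 3.1 8B", "Llama 3.2 3B", "Llama 3.2 1B",
        ],
    },
}


def recommend(use_case: str, max_tier: str) -> list[str]:
    """Return models from the largest listed tier that does not exceed max_tier.

    Assumed rule: if the use case has no entry at max_tier, fall back to the
    next smaller tier. Returns an empty list when nothing fits.
    """
    by_tier = RECOMMENDED[use_case]
    for tier in TIERS[TIERS.index(max_tier):]:
        if tier in by_tier:
            return by_tier[tier]
    return []


if __name__ == "__main__":
    print(recommend("Code generation & reasoning", "Medium"))
    print(recommend("Fast semantic search & extraction", "Large"))
```

Falling back to a smaller tier keeps the lookup usable for rows such as fast semantic search, which list no Large option.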
Migrating from Closed Models?
If you’re currently using Claude, OpenAI / GPT, or Gemini models, here’s a guide to the best open source alternatives on Fireworks by use case and latency requirements.
Claude Alternatives
| Closed Source | Use Case | Latency Budget | Open Source Alternative |
|---|---|---|---|
| Claude Sonnet 4.5 | • Agentic use cases • Coding • Research agents | High | • Deepseek V3.1 • Kimi K2 0905 • GLM 4.6 |
| Claude Haiku 4.5 | • Agentic use cases • Coding • Research agents | Low | • Qwen 3 Coder 30B • Qwen 3 14B • Mistral Codestral 22B |
OpenAI GPT Alternatives
| Closed Source | Use Case | Latency Budget | Open Source Alternative |
|---|---|---|---|
| GPT-5 | • Agentic use cases • Research agents | High | • Kimi K2 0905 • Qwen 3 235B |
| GPT-5 mini & nano | • Chatbots • Intent classification • Search | Low | • Qwen 3 14B and 8B • GPT-OSS 120B and 20B |
Google Gemini Alternatives
| Closed Source | Use Case | Latency Budget | Open Source Alternative |
|---|---|---|---|
| Gemini 2.5 Pro | • Agentic use cases • Research agents | High | • Kimi K2 Thinking • Qwen 3 235B |
| Gemini 2.5 Flash & Flash-Lite | • Chatbots • Intent classification • Search | Low | • Qwen 3 4B and 8B • Llama 3.1 8B • GPT-OSS 20B |
- High latency budget: Quality is the priority. Best for complex reasoning, multi-step workflows, and research tasks where accuracy matters more than speed.
- Low latency budget: Speed is the priority. Best for user-facing applications like chatbots, real-time search, and high-throughput classification.
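The migration tables can also be read as a lookup from a closed model to its open alternatives, filtered by latency budget. The following is a minimal sketch under stated assumptions: it encodes only the four rows that name a single closed model, and `ALTERNATIVES`, `alternatives_for`, and `candidates_for_budget` are hypothetical names invented for illustration. The strings are display names from the tables, not API model identifiers.

```python
# Rows copied from the migration tables: closed model -> (latency budget, alternatives).
# Only rows that name a single closed model are included in this sketch.
ALTERNATIVES = {
    "Claude Sonnet 4.5": ("High", ["Deepseek V3.1", "Kimi K2 0905", "GLM 4.6"]),
    "Claude Haiku 4.5": ("Low", ["Qwen 3 Coder 30B", "Qwen 3 14B", "Mistral Codestral 22B"]),
    "GPT-5": ("High", ["Kimi K2 0905", "Qwen 3 235B"]),
    "Gemini 2.5 Pro": ("High", ["Kimi K2 Thinking", "Qwen 3 235B"]),
}


def alternatives_for(closed_model: str) -> list[str]:
    """Return the open alternatives listed for one closed model."""
    _budget, models = ALTERNATIVES[closed_model]
    return list(models)


def candidates_for_budget(budget: str) -> list[str]:
    """Return every alternative listed under a latency budget, without duplicates.

    Order follows first appearance in the tables.
    """
    seen: list[str] = []
    for row_budget, models in ALTERNATIVES.values():
        if row_budget != budget:
            continue
        for model in models:
            if model not in seen:
                seen.append(model)
    return seen


if __name__ == "__main__":
    print(alternatives_for("GPT-5"))
    print(candidates_for_budget("High"))
```

Grouping by budget makes the overlap visible: Kimi K2 0905 and Qwen 3 235B each appear under more than one closed model in the high-budget rows.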
Last updated: November 2025