Choose by Use Case
| Category | Use Case | Recommended Models |
|---|---|---|
| Code & Development | Code generation, reasoning & agentic tasks | DeepSeek V4 Pro, Kimi K2.6, GLM 5.1, MiniMax M2.7 |
| AI Applications | AI agents with tool use | Kimi K2.6, DeepSeek V4 Pro, GLM 5.1, MiniMax M2.7 |
| AI Applications | General reasoning & planning | DeepSeek V4 Pro, Kimi K2.6, GLM 5.1, GPT-OSS 120B (medium) |
| AI Applications | Long context & summarization | DeepSeek V4 Pro, Kimi K2.6, Qwen3.6 Plus, GLM 5.1, DeepSeek V4 Flash |
| AI Applications | Fast extraction, classification & search | DeepSeek V4 Flash, MiniMax M2.5, Kimi K2.5, Step 3.7 Flash, GPT-OSS 20B (small) |
| Vision & Multimodal | Vision & document understanding | Kimi K2.6, Qwen3.6 Plus, Step 3.7 Flash, Gemma 4 31B (small) |
| Vision & Multimodal | Audio & video understanding | Qwen3 Omni 30B A3B Instruct, NVIDIA Nemotron 3 Nano Omni 30B A3B |
| Search & Retrieval | Embeddings & reranking | Qwen3 Embedding 8B, Qwen3 Reranker 8B |
Migrating from Closed Models?
If you’re currently using Claude, OpenAI / GPT, or Gemini models, here’s a guide to the best open source alternatives on Fireworks by use case and latency requirements.
Claude Alternatives
| Closed Source | Use Case | Latency Budget | Open Source Alternative |
|---|---|---|---|
| Claude Opus 4.8 / Sonnet 4.6 | • Agentic use cases • Coding • Research agents | High | • DeepSeek V4 Pro • Kimi K2.6 • GLM 5.1 • MiniMax M2.7 • Qwen3.6 Plus |
| Claude Haiku 4.5 | • Agentic use cases • Coding • Research agents | Low | • Step 3.7 Flash • DeepSeek V4 Flash • MiniMax M2.5 • GPT-OSS 20B (small) |
OpenAI GPT Alternatives
| Closed Source | Use Case | Latency Budget | Open Source Alternative |
|---|---|---|---|
| GPT-5.5 / GPT-5.5 Pro | • Agentic use cases • Research agents | High | • DeepSeek V4 Pro • Kimi K2.6 • GLM 5.1 • MiniMax M2.7 |
| GPT-5.4 mini & nano | • Chatbots • Intent classification • Search | Low | • Step 3.7 Flash • DeepSeek V4 Flash • MiniMax M2.5 • GPT-OSS 20B (small) |
Google Gemini Alternatives
| Closed Source | Use Case | Latency Budget | Open Source Alternative |
|---|---|---|---|
| Gemini 3.1 Pro | • Agentic use cases • Research agents • Multimodal | High | • DeepSeek V4 Pro • Kimi K2.6 • GLM 5.1 • Qwen3.6 Plus • MiniMax M2.7 |
| Gemini 3.5 Flash & 3.1 Flash-Lite | • Chatbots • Intent classification • Search | Low | • Step 3.7 Flash • DeepSeek V4 Flash • MiniMax M2.5 • GPT-OSS 20B (small) |
- High latency budget: Quality is the priority. Best for complex reasoning, multi-step workflows, and research tasks where accuracy matters more than speed.
- Low latency budget: Speed is the priority. Best for user-facing applications like chatbots, real-time search, and high-throughput classification.
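As a rough illustration, the migration tables above can be encoded as a small lookup keyed by closed-model family and latency budget. This is only a sketch: `ALTERNATIVES` and `suggest_alternatives` are hypothetical names introduced here, and the strings are the display names from the tables, not API model identifiers.

```python
from typing import Dict, List, Tuple

# Hypothetical lookup built from the migration tables above.
# Values are display names as listed in the tables, not API model IDs.
ALTERNATIVES: Dict[Tuple[str, str], List[str]] = {
    ("claude", "high"): ["DeepSeek V4 Pro", "Kimi K2.6", "GLM 5.1",
                         "MiniMax M2.7", "Qwen3.6 Plus"],
    ("claude", "low"): ["Step 3.7 Flash", "DeepSeek V4 Flash",
                        "MiniMax M2.5", "GPT-OSS 20B"],
    ("gpt", "high"): ["DeepSeek V4 Pro", "Kimi K2.6", "GLM 5.1",
                      "MiniMax M2.7"],
    ("gpt", "low"): ["Step 3.7 Flash", "DeepSeek V4 Flash",
                     "MiniMax M2.5", "GPT-OSS 20B"],
    ("gemini", "high"): ["DeepSeek V4 Pro", "Kimi K2.6", "GLM 5.1",
                         "Qwen3.6 Plus", "MiniMax M2.7"],
    ("gemini", "low"): ["Step 3.7 Flash", "DeepSeek V4 Flash",
                        "MiniMax M2.5", "GPT-OSS 20B"],
}


def suggest_alternatives(family: str, latency_budget: str) -> List[str]:
    """Return the open source candidates listed for a family and budget."""
    key = (family.strip().lower(), latency_budget.strip().lower())
    if key not in ALTERNATIVES:
        raise KeyError(f"no entry for family={family!r}, budget={latency_budget!r}")
    # Return a copy so callers cannot mutate the shared table.
    return list(ALTERNATIVES[key])


if __name__ == "__main__":
    for name in suggest_alternatives("Claude", "low"):
        print(name)
```

The order within each list follows the tables; treat it as a starting shortlist to evaluate on your own workload rather than a strict ranking.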
Last updated: June 2026