Looking for the right open source model? Whether you’re exploring by use case or migrating from closed source models like Claude, GPT, or Gemini, this guide provides recommendations based on Fireworks internal testing, customer deployments, and external benchmarks. We update it regularly as new models emerge.
Model sizes are marked as Small, Medium, or Large. For the best quality, use large models or fine-tune medium or small models. For the best speed, use small models.

Choose by Use Case

| Category | Use Case | Recommended Models |
|---|---|---|
| Code & Development | Code generation & reasoning | Kimi K2.5, Kimi K2 0905, Deepseek V3.2, GLM 4.7 (Large)<br>Qwen3 235B A22B, Qwen2.5-32B-Coder (Medium) |
| Code & Development | Code completion & bug fixing | Kimi K2.5, Kimi K2 0905 (Large)<br>Qwen3 235B A22B, Qwen2.5-32B-Coder (Medium)<br>Qwen3 14B, Qwen3 8B (Small) |
| AI Applications | AI agents with tool use | Kimi K2.5, Kimi K2 0905, Deepseek V3.2, Qwen3 235B A22B, GLM 4.7 (Large)<br>Qwen3 family models (Large/Medium/Small) |
| AI Applications | General reasoning & planning | Kimi K2.5, Kimi K2 0905, Kimi K2 Thinking, Deepseek V3.2, Qwen3 235B A22B, GLM 4.7 (Large)<br>GPT-OSS-120B, Qwen2.5-72B-Instruct, Llama 3.3 70B (Medium) |
| AI Applications | Long context & summarization | Kimi K2.5, Kimi K2 0905 (Large)<br>GPT-OSS-120B (Medium) |
| AI Applications | Fast semantic search & extraction | GPT-OSS-120B (Medium)<br>GPT-OSS 20B, Qwen3 8B, Qwen3 4B, Llama 3.1 8B, Llama 3.2 3B, Llama 3.2 1B (Small) |
| Vision & Multimodal | Vision & document understanding | Kimi K2.5, Qwen2.5-VL 72B Instruct (Large)<br>Deepseek OCR, Qwen3 VL 30B A3B, Qwen2.5-VL 3B/7B (Small) |

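Once you have picked a model from the table above, you can call it through the Fireworks OpenAI-compatible chat completions endpoint. The sketch below is illustrative: the model slug is an assumption, so look up the exact ID of your chosen model in the Fireworks Model Library.

```python
# Minimal sketch: query a recommended model through Fireworks'
# OpenAI-compatible chat completions endpoint.
import os
import requests

resp = requests.post(
    "https://api.fireworks.ai/inference/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}"},
    json={
        # Assumed slug for illustration; verify the exact ID in the Model Library.
        "model": "accounts/fireworks/models/deepseek-v3p2",
        "messages": [
            {"role": "user", "content": "Write a Python function that parses an ISO 8601 date."}
        ],
        "max_tokens": 512,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```
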
Migrating from Closed Models?

If you’re currently using Claude, OpenAI / GPT, or Gemini models, here’s a guide to the best open source alternatives on Fireworks by use case and latency requirements.

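Because the Fireworks inference API is OpenAI-compatible, migrating is usually a matter of changing the client's base URL, API key, and model name; prompts and the rest of your calling code can often stay as they are. A minimal sketch with the OpenAI Python SDK, where the model slug is an assumption (verify the exact ID in the Fireworks Model Library):

```python
# Minimal migration sketch: reuse the OpenAI Python SDK, point it at Fireworks,
# and swap the model name. Everything else in the call stays the same.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

completion = client.chat.completions.create(
    # Assumed slug; this model name is typically the only line that changes
    # relative to your existing Claude / GPT / Gemini call.
    model="accounts/fireworks/models/kimi-k2-instruct-0905",
    messages=[{"role": "user", "content": "Summarize the trade-offs between these two designs."}],
)
print(completion.choices[0].message.content)
```
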
Claude Alternatives

| Closed Source | Use Case | Latency Budget | Open Source Alternative |
|---|---|---|---|
| Claude Sonnet 4.5 | Agentic use cases, coding, research agents | High | Deepseek V3.2<br>Kimi K2 0905<br>GLM 4.7 |
| Claude Haiku 4.5 | Agentic use cases, coding, research agents | Low | Qwen3 14B<br>Qwen3 8B<br>Mistral Codestral 22B |

OpenAI GPT Alternatives

| Closed Source | Use Case | Latency Budget | Open Source Alternative |
|---|---|---|---|
| GPT-5 | Agentic use cases, research agents | High | Kimi K2 Thinking<br>Kimi K2 0905<br>Deepseek V3.2 |
| GPT-5 mini & nano | Chatbots, intent classification, search | Low | Qwen3 14B and 8B<br>GPT-OSS 120B and 20B |

Google Gemini Alternatives

| Closed Source | Use Case | Latency Budget | Open Source Alternative |
|---|---|---|---|
| Gemini 3 Pro | Agentic use cases, research agents | High | Kimi K2 Thinking<br>Kimi K2 0905<br>Deepseek V3.2 |
| Gemini 3 Flash & Flash-Lite | Chatbots, intent classification, search | Low | Qwen3 4B and 8B<br>Llama 3.1 8B<br>GPT-OSS 20B |

Understanding Latency Budget:
  • High latency budget: Quality is priority. Best for complex reasoning, multi-step workflows, and research tasks where accuracy matters more than speed.
  • Low latency budget: Speed is priority. Best for user-facing applications like chatbots, real-time search, and high-throughput classification.
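For low-latency-budget paths, a small model combined with streaming keeps time to first token short in user-facing applications. A sketch, again over the OpenAI-compatible endpoint, with the model slug assumed rather than verified:

```python
# Sketch for a low-latency, user-facing path: pick a small model and stream
# tokens so the first characters reach the user quickly.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

stream = client.chat.completions.create(
    # Assumed slug for a small, fast model; check the Model Library.
    model="accounts/fireworks/models/qwen3-8b",
    messages=[{"role": "user", "content": "Classify the intent of: 'cancel my order'"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```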

You can explore all models in the Fireworks Model Library.
Last updated: February 2026