A common question from developers is: which open models should I try for my use case? There is no single right answer! The list below is a starting point based on our internal testing, external benchmarks, and community feedback, and we hope to update it regularly as new models come out. You can find these models (and more) in the Fireworks Model Library.

Note:

  • Models on Serverless are marked [Serverless]; other models are available On-Demand and are marked [On-Demand]
  • Model sizes are roughly indicated as Small / Medium / Large.

Last updated: April 28, 2025

Recommended models by use case / capability:

Code planning, reasoning & generation; vibe-coding
  • DeepSeek R1 & V3-0324 [Serverless][Large] are SOTA for general code tasks
  • Qwen2.5-32B-Coder [On-Demand][Medium]

Code fixing & completion
  • Qwen2.5-32B-Coder [On-Demand][Medium] and smaller models (0.5B, 3B, 7B, 14B) in the same family
  • DeepSeek V2.5 [On-Demand][Medium]

General planning, reasoning & understanding
  • DeepSeek R1 & V3-0324 [Serverless][Large] are SOTA for general reasoning tasks
  • Qwen2.5-72B-Instruct [Serverless][Medium]
  • Llama 3.3 70B [Serverless][Medium]

Agentic tool use & function calling
  • Qwen2.5-72B-Instruct [Serverless][Medium] is best for function calling and agentic tool use

Long context & summarization
  • Llama 4 Maverick & Scout [Serverless][Medium/Large] have a 1M context window

Vision / image understanding & document processing
  • Qwen2.5-32B-VL [Serverless][Medium], Qwen2.5-72B-VL [On-Demand][Medium] and other smaller models in the series (7B, 3B)
  • Llama 4 Maverick & Scout [Serverless][Medium/Large]

Query understanding & entity extraction with very low latency
  • Llama 3.1 8B [Serverless][Small] and smaller models (Llama 3.2 3B, 3.2 1B)
  • Qwen 2.5 7B [On-Demand][Small] and smaller models (0.5B, 1.5B, 3B)
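
If you want to keep a shortlist like the one above in code, for example to pick candidates for an evaluation run, one option is a small lookup table. The sketch below is illustrative only: the use-case keys, the `Recommendation` class, and the `candidates` helper are hypothetical names, the entries are a subset of the list above, and the model strings are display names copied from it, not API model identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    model: str       # display name as written in the list above (not an API model ID)
    deployment: str  # "Serverless" or "On-Demand"
    size: str        # "Small", "Medium", or "Large"


# Hypothetical use-case keys; entries are a subset of the list above.
RECOMMENDATIONS = {
    "code-generation": [
        Recommendation("DeepSeek R1", "Serverless", "Large"),
        Recommendation("DeepSeek V3-0324", "Serverless", "Large"),
        Recommendation("Qwen2.5-32B-Coder", "On-Demand", "Medium"),
    ],
    "function-calling": [
        Recommendation("Qwen2.5-72B-Instruct", "Serverless", "Medium"),
    ],
    "low-latency-extraction": [
        Recommendation("Llama 3.1 8B", "Serverless", "Small"),
        Recommendation("Qwen 2.5 7B", "On-Demand", "Small"),
    ],
}


def candidates(use_case, deployment=None):
    """Return model names for a use case, optionally filtered by deployment type."""
    recs = RECOMMENDATIONS.get(use_case, [])
    return [r.model for r in recs if deployment is None or r.deployment == deployment]


print(candidates("code-generation", deployment="Serverless"))
# → ['DeepSeek R1', 'DeepSeek V3-0324']
```

Filtering on deployment type is handy when you only want models you can call without setting up a dedicated deployment first; an unknown use case simply returns an empty list.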