2025-06-24

🎯 Build SDK: Reward-kit integration for evaluator development

The Build SDK now integrates natively with reward-kit to simplify evaluator development for Reinforcement Fine-Tuning (RFT). You can create custom evaluators in Python, with automatic dependency management and seamless deployment to Fireworks infrastructure.

Key features:

  • Native reward-kit integration for evaluator development
  • Automatic packaging of dependencies from pyproject.toml or requirements.txt
  • Local testing capabilities before deployment
  • Direct integration with Fireworks datasets and evaluation jobs
  • Support for third-party libraries and complex evaluation logic
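
For a feel of the workflow, a minimal evaluator might look like the sketch below. The @reward_function decorator, EvaluateResult, and MetricResult names follow reward-kit's public examples, and the dict-style messages are an assumption, so treat this as illustrative rather than canonical:

    # evaluator.py: minimal reward-kit evaluator (illustrative sketch)
    from reward_kit import reward_function, EvaluateResult, MetricResult

    @reward_function
    def evaluate(messages, **kwargs):
        # Score the final assistant message; here we simply reward concise answers.
        response = messages[-1]["content"] if messages else ""
        word_count = len(response.split())
        concise = word_count <= 100
        return EvaluateResult(
            score=1.0 if concise else 0.0,
            reason="Within 100 words" if concise else "Over 100 words",
            metrics={
                "conciseness": MetricResult(
                    score=1.0 if concise else 0.0,
                    success=concise,
                    reason=f"{word_count} words",
                ),
            },
        )

Since dependencies are packaged automatically from pyproject.toml or requirements.txt, any third-party imports used in an evaluator are bundled at deployment time, and the function can be exercised locally with a plain Python call before deploying.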

See our Developing Evaluators guide to get started with your first evaluator in minutes.

2025-06-24

Added a new Responses API for advanced conversational workflows and integrations:

  • Continue conversations across multiple turns using the previous_response_id parameter to maintain context without resending full history.
  • Stream responses in real time as they are generated for responsive applications.
  • Control response storage with the store parameter—choose whether responses are retrievable by ID or ephemeral.
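
As an illustration, a multi-turn flow might look like the sketch below, using an OpenAI-compatible client. The base URL and model ID are placeholders and the exact parameter support is assumed; consult the guide for confirmed usage:

    from openai import OpenAI

    # Assumed OpenAI-compatible endpoint; substitute your own API key and model.
    client = OpenAI(
        base_url="https://api.fireworks.ai/inference/v1",
        api_key="<FIREWORKS_API_KEY>",
    )

    # First turn: store=True keeps the response retrievable by its ID.
    first = client.responses.create(
        model="accounts/fireworks/models/<MODEL_ID>",
        input="Summarize the plot of Hamlet in two sentences.",
        store=True,
    )

    # Follow-up turn: previous_response_id carries context forward
    # without resending the full conversation history.
    follow_up = client.responses.create(
        model="accounts/fireworks/models/<MODEL_ID>",
        input="Now do the same for Macbeth.",
        previous_response_id=first.id,
    )
    print(follow_up.output_text)

Passing stream=True to the same call yields incremental events rather than a single response object.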

See the Responses API guide for usage examples and details.

2025-05-20

What’s new

Diarization and batch processing support added to audio inference. See the blog for details.

2025-05-19

What’s new

🚀 Easier & faster LoRA fine-tune deployments on Fireworks

You can now deploy a LoRA fine-tune with a single command, at speeds that approximately match the base model's:

firectl create deployment "accounts/fireworks/models/<MODEL_ID of lora model>"

Previously, this required two separate steps, and the resulting deployment ran slower than the base model:

  1. Create a deployment of the base model with addons enabled: firectl create deployment "accounts/fireworks/models/<MODEL_ID of base model>" --enable-addons
  2. Load the LoRA addon onto the deployment: firectl load-lora <MODEL_ID> --deployment <DEPLOYMENT_ID>

Docs: https://docs.fireworks.ai/models/deploying#deploying-to-on-demand

This change applies to dedicated deployments serving a single LoRA. You can still deploy multiple LoRAs on one deployment, or deploy LoRAs on some Serverless models, as described in the docs.
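
Once the deployment is ready, you query it through the standard OpenAI-compatible chat completions endpoint. The sketch below is illustrative only: every ID is a placeholder, and the <model>#<deployment> addressing syntax is an assumption, so check the deployment docs linked above for the exact form.

    import requests

    url = "https://api.fireworks.ai/inference/v1/chat/completions"
    payload = {
        # Assumed "<model>#<deployment>" syntax for targeting a dedicated deployment.
        "model": "accounts/<ACCOUNT_ID>/models/<MODEL_ID>#accounts/<ACCOUNT_ID>/deployments/<DEPLOYMENT_ID>",
        "messages": [{"role": "user", "content": "Hello from my LoRA fine-tune!"}],
    }
    headers = {"Authorization": "Bearer <FIREWORKS_API_KEY>"}

    resp = requests.post(url, json=payload, headers=headers, timeout=60)
    print(resp.json()["choices"][0]["message"]["content"])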