2025-05-20

What’s new

Diarization and batch-processing support have been added to audio inference. See blog

2025-05-19

What’s new

🚀 Easier & faster LoRA fine-tune deployments on Fireworks

You can now deploy a LoRA fine-tune with a single command, at speeds that approximately match the base model:

firectl create deployment "accounts/fireworks/models/<MODEL_ID of lora model>"

Previously, this required two distinct steps, and the resulting deployment was slower than the base model:

  1. Create a deployment using firectl create deployment "accounts/fireworks/models/<MODEL_ID of base model>" --enable-addons
  2. Then deploy the addon to the deployment: firectl load-lora <MODEL_ID> --deployment <DEPLOYMENT_ID>

Docs: https://docs.fireworks.ai/models/deploying#deploying-to-on-demand

This change applies to dedicated deployments serving a single LoRA. You can still deploy multiple LoRAs on one deployment, or deploy LoRA(s) on some Serverless models, as described in the docs.
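Once the deployment is up, the fine-tuned model can be queried through Fireworks' OpenAI-compatible chat completions endpoint. The sketch below only builds the request; the model ID and environment-variable name are placeholders, not values from this changelog — substitute your own account and model.

```python
import json
import os

# Fireworks' OpenAI-compatible chat completions endpoint.
API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"

def build_request(model_id: str, prompt: str):
    """Build headers and JSON payload for a chat completion call
    against a deployed LoRA fine-tune. Sending it (e.g. with
    requests.post) requires a valid API key."""
    headers = {
        # FIREWORKS_API_KEY is an assumed env var holding your key.
        "Authorization": f"Bearer {os.environ.get('FIREWORKS_API_KEY', '')}",
        "Content-Type": "application/json",
    }
    payload = {
        # Hypothetical model ID; use your own LoRA model's path.
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return headers, payload

headers, payload = build_request(
    "accounts/fireworks/models/my-lora-model",  # placeholder
    "Summarize LoRA fine-tuning in one sentence.",
)
print(json.dumps(payload, indent=2))
```

Because the deployment serves the LoRA directly, the request looks identical to querying a base model; only the `model` field points at the fine-tune.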