FW_HOSTED, start from Tinker API Compatibility & Full Parameter Tuning instead. Fireworks manages the bucket plumbing in that path.
Architecture
You own: trainer, reward shaping, checkpoint cadence, rollout orchestration. Fireworks owns: hot-load logistics, distributed weight swap, inference serving, KV cache across rollouts.

End-to-end loop
- Create a hot-load deployment.
- Upload and hot-load an initial full snapshot.
- Run rollouts against that snapshot.
- Upload and hot-load the next incremental snapshot.
- Run rollouts again.
- Every 20th or 30th step, publish another full snapshot instead of an incremental one. Otherwise, repeat from step 4.
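The steps above reduce to a simple cadence. The sketch below is illustrative only — the `snapshot_kind` function and the `FULL_EVERY` knob are not Fireworks API surface:

```shell
# Publish a full snapshot on the first step and every FULL_EVERY steps
# thereafter; publish an incremental snapshot for all other steps.
FULL_EVERY=20   # or 30; bounds how long an incremental chain can grow

snapshot_kind() {
  local step=$1
  if [ "$step" -eq 1 ] || [ $(( step % FULL_EVERY )) -eq 0 ]; then
    echo full
  else
    echo incremental
  fi
}

snapshot_kind 1    # -> full
snapshot_kind 7    # -> incremental
snapshot_kind 40   # -> full
```

Periodic full snapshots keep recovery cheap: if an incremental chain breaks, you only ever need to fall back to the most recent full snapshot.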
1. Create a hot-load deployment
Create the deployment that will serve rollouts. The `--enable-hot-load` family of flags is currently hidden during preview, so you may need to pass them explicitly.
- `--deployment-shape` is optional. If omitted, `firectl` will prompt you to pick one interactively.
- `--hot-load-bucket-type` currently accepts `MINIO`, `S3`, `NEBIUS`, or `FW_HOSTED`.
  - `FW_HOSTED` is the Fireworks-managed trainer path. This guide focuses on external-bucket BYOT integrations.
- `--hot-load-bucket-url` is required for external-bucket flows when `--enable-hot-load` is set. Format examples: `s3://mybucket/path`, `gs://mybucket/path`. No trailing slash.
- `--region` picks where the deployment runs (for example `US_OHIO_1`, `US_VIRGINIA_1`). Keep the trainer and bucket geographically close for upload speed.
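Putting those flags together, a creation command might look like the following sketch. The model name and region are placeholders; the flags are the ones described above:

```shell
firectl create deployment accounts/fireworks/models/<base-model> \
  --enable-hot-load \
  --hot-load-bucket-type S3 \
  --hot-load-bucket-url s3://mybucket/path \
  --region US_OHIO_1
```

`--deployment-shape` is omitted here, so `firectl` will prompt for one interactively.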

2. Upload and hot-load an initial full snapshot
For the first step, upload a full HuggingFace-format checkpoint and then signal Fireworks to load it.

Snapshot layout
Place each snapshot under its own subdirectory keyed by an opaque `checkpoint_id`:

- `checkpoint_id` is any string you pick (for example `version_001` or `step_00100`).
- The checkpoint must look like the base model on HuggingFace: `config.json`, tokenizer, and safetensors weights.
- Split weights into multiple safetensors files, each under about 5 GB.
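As a concrete sketch, a snapshot directory could be laid out and uploaded like this. File names other than `config.json` and the safetensors sharding convention are illustrative, and the bucket destination must sit under your `--hot-load-bucket-url`:

```shell
# Local snapshot in HuggingFace layout, keyed by an opaque checkpoint_id.
CKPT=checkpoints/version_001
mkdir -p "$CKPT"
touch "$CKPT/config.json" "$CKPT/tokenizer.json"
touch "$CKPT/model-00001-of-00002.safetensors" \
      "$CKPT/model-00002-of-00002.safetensors"   # keep each shard < ~5 GB

# Upload the whole snapshot under its checkpoint_id subdirectory:
# aws s3 sync "$CKPT" s3://mybucket/path/version_001
ls "$CKPT"
```

The next snapshot goes under a fresh `checkpoint_id` subdirectory (for example `s3://mybucket/path/version_002`) rather than overwriting this one.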
Signal the snapshot is ready
Once all files for the snapshot are uploaded, signal Fireworks to begin loading.

Wait until replicas are ready
Poll the system state until every replica reports readiness on the new snapshot:

- every replica has `readiness: true`, and
- every replica's `current_snapshot_identity` equals the `identity` you just signaled.
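The two conditions above can be checked mechanically. A minimal sketch, assuming the system state arrives as JSON with a `replicas` array carrying the two fields named above — the status endpoint itself and the exact JSON envelope are assumptions, not part of this guide:

```shell
# Ready only when every replica has readiness == true AND reports the
# identity you just signaled. Exits 0 when ready, 1 otherwise.
all_ready() {
  # stdin: system-state JSON; $1: the signaled identity
  jq -e --arg id "$1" \
    '[.replicas[] | .readiness and (.current_snapshot_identity == $id)] | all' \
    >/dev/null
}

# One replica still on the old snapshot -> not ready yet:
echo '{"replicas":[
  {"readiness": true,  "current_snapshot_identity": "version_001"},
  {"readiness": false, "current_snapshot_identity": "version_000"}
]}' | all_ready version_001 && echo ready || echo "not ready"   # -> not ready

# Poll loop sketch ($STATUS_URL is a placeholder for wherever you fetch state):
# until curl -s "$STATUS_URL" | all_ready version_001; do sleep 5; done
```

Checking the identity as well as the readiness bit matters: a replica can report ready while still serving the previous snapshot.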
3. Run rollouts
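A rollout request is a standard chat-completions call against the deployment. In the sketch below, the model identifier is a placeholder and the session-affinity header name is illustrative — see Inference for RL rollouts for the exact header:

```shell
curl -s https://api.fireworks.ai/inference/v1/chat/completions \
  -H "Authorization: Bearer $FIREWORKS_API_KEY" \
  -H "Content-Type: application/json" \
  -H "x-session-affinity: trajectory-0042" \
  -d '{
        "model": "<your-model-or-deployment-id>",
        "messages": [{"role": "user", "content": "First turn of the rollout"}]
      }'
```

Reusing one affinity value for all turns of a trajectory pins them to the same replica, so later turns hit a warm KV cache instead of recomputing the shared prefix.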
Once replicas are ready, call the regular OpenAI-compatible inference API. For RL rollouts you'll usually want session-affinity headers so multi-turn trajectories reuse KV cache on the same replica.

4. Upload and hot-load incremental snapshots
For most intermediate training steps, publish an incremental snapshot against the currently loaded snapshot instead of another full snapshot. Fireworks supports the public ARC2 format (`arc_v2`) for this flow.
Upload the next snapshot under a new `checkpoint_id`, then signal it with `incremental_snapshot_metadata`:
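The original signaling call is not reproduced here; as a loose sketch, it might take a shape like the following. The URL, envelope, and every field name except `incremental_snapshot_metadata` and `arc_v2` are assumptions:

```shell
# Hypothetical signal for an incremental (arc_v2) snapshot; everything except
# incremental_snapshot_metadata and arc_v2 is an illustrative assumption.
curl -X POST "$SIGNAL_URL" \
  -H "Authorization: Bearer $FIREWORKS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "identity": "version_002",
        "incremental_snapshot_metadata": { "format": "arc_v2" }
      }'
```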
As before, wait until every replica reports `readiness: true` and `current_snapshot_identity == "version_002"`.
5. Repeat the loop
- Use a new full snapshot for the first step and then every 20th or 30th step after that.
- Use an incremental snapshot for the intermediate steps.
- If an incremental hot-load fails or the chain gets into a bad state, fall back to a new full snapshot.
- If you need lower-level recovery steps, see Ledger & debugging for RL rollouts.
Next steps
Ledger & debugging
Inspect snapshot history, reset the ledger, and reason about request behavior during weight swaps.
Inference for RL rollouts
Session affinity headers, behavior during weight swap, and MoE Router Replay (R3).
Fireworks-hosted trainer
The alternative path where Fireworks runs the trainer via the Tinker-compatible SDK.