During RL training the policy updates step by step, and the inference deployment needs those updated weights to generate the next batch of rollouts. The cookbook wires this as a shared GCS bucket:
  • The trainer writes a fresh checkpoint to the bucket after each optimizer step (or on a configurable cadence).
  • The deployment watches the same bucket and swaps in new weights without a pod restart.
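The write-then-watch pattern above can be illustrated with a small toy model. This is only a sketch under loud assumptions: a local temporary directory stands in for the GCS bucket, and `ToyTrainer`, `ToyDeployment`, and `run_demo` are hypothetical names invented for this example. None of it is Fireworks SDK code, and the real deployment's detection mechanism may differ from simple polling.

```python
import tempfile
from pathlib import Path


class ToyTrainer:
    """Hypothetical trainer: writes one checkpoint file per optimizer step."""

    def __init__(self, bucket: Path) -> None:
        self.bucket = bucket
        self.step = 0

    def train_step(self) -> None:
        self.step += 1
        # Write under a temporary name, then rename, so a watcher
        # never picks up a half-written checkpoint.
        tmp = self.bucket / f"step-{self.step:06d}.tmp"
        tmp.write_text(f"weights@{self.step}")
        tmp.rename(self.bucket / f"step-{self.step:06d}.ckpt")


class ToyDeployment:
    """Hypothetical deployment: polls the shared directory and swaps weights in place."""

    def __init__(self, bucket: Path) -> None:
        self.bucket = bucket
        self.loaded_step = 0
        self.weights = "weights@0"

    def poll(self) -> bool:
        """Load the newest checkpoint if it is newer than the loaded one."""
        # Zero-padded step numbers make lexicographic order match step order.
        checkpoints = sorted(self.bucket.glob("step-*.ckpt"))
        if not checkpoints:
            return False
        latest = checkpoints[-1]
        latest_step = int(latest.stem.split("-")[1])
        if latest_step <= self.loaded_step:
            return False
        # Swap weights without recreating the object (the analogue of "no pod restart").
        self.weights = latest.read_text()
        self.loaded_step = latest_step
        return True


def run_demo(steps: int = 3) -> list[str]:
    """Alternate training steps and polls; return the weights used for each rollout batch."""
    with tempfile.TemporaryDirectory() as d:
        bucket = Path(d)
        trainer, deployment = ToyTrainer(bucket), ToyDeployment(bucket)
        seen = []
        for _ in range(steps):
            trainer.train_step()
            deployment.poll()
            seen.append(deployment.weights)
        return seen
```

Calling `run_demo(3)` shows the deployment picking up each new checkpoint before the next batch while the same `ToyDeployment` object stays alive throughout.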
Terminology. The internal Fireworks name for this mechanism is hotload. You’ll see that name in SDK field names (hot_load_trainer_job, hot_load_deployment_id, hot_load_bucket_url), methods (WeightSyncer.save_and_hotload), and server error messages. “Weight sync” and “hotload” refer to the same thing.

Normal flow

Use the cookbook’s setup_infra entrypoint — it creates the trainer, then creates the deployment pointing at it, with no extra wiring. The default DeployConfig(weight_sync_scope=WeightSyncScope.PER_TRAINER) is what you want for almost every run. If you misconfigure the pairing, the server rejects the CreateDeployment or CreateRlorTrainerJob call up front with an error that links back here.

WeightSyncScope: who owns the bucket

DeployConfig.weight_sync_scope controls which resource must be created first:
| Scope | Bucket owner | Use when |
| --- | --- | --- |
| PER_TRAINER (default) | Trainer (one bucket per run) | Single run, or one trainer feeding multiple deployments (sampler + held-out eval) |
| PER_DEPLOYMENT | Deployment (stable bucket across trainer runs) | Long-lived deployment, many sequential trainers, can’t tolerate deployment restarts between runs |
setup_infra dispatches on this single field and wires the rest correctly. The two scopes are mutually exclusive for the same trainer ↔ deployment pair — don’t mix them.
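To make the ordering rule concrete, here is a minimal sketch of how a dispatch on the scope could look. The enum below is a local stand-in that only mirrors the names `WeightSyncScope.PER_TRAINER` and `PER_DEPLOYMENT`; its string values and the `creation_order` helper are hypothetical and are not part of the cookbook or SDK. It assumes that the bucket owner is the resource that must exist first.

```python
from enum import Enum


class WeightSyncScope(Enum):
    """Local stand-in for the cookbook's enum; the string values are made up."""

    PER_TRAINER = "per_trainer"
    PER_DEPLOYMENT = "per_deployment"


def creation_order(scope: WeightSyncScope) -> tuple[str, str]:
    """Hypothetical helper: return (bucket owner, dependent) in creation order."""
    if scope is WeightSyncScope.PER_TRAINER:
        # The trainer owns the bucket, so the deployment is created second
        # and pointed at the trainer.
        return ("trainer", "deployment")
    if scope is WeightSyncScope.PER_DEPLOYMENT:
        # The deployment owns a stable bucket; each trainer run attaches to it.
        return ("deployment", "trainer")
    raise ValueError(f"unknown scope: {scope!r}")
```

In this model, `creation_order(WeightSyncScope.PER_TRAINER)` yields the trainer first, matching the default flow in which the trainer is created before the deployment that points at it.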

Diagnosing errors

The control plane catches scope-mix mistakes at create time and returns an error that names both resources and suggests the fix. For the full list of server error strings and per-error recovery steps, see the cookbook’s dev skill: skills/dev/references/rl/hotload.md. It also covers trainer retention, the unified promote API, and runtime bucket-mismatch warnings.

See also