During RL training, the policy is updated at every optimizer step, and the inference deployment needs those updated weights to generate the next batch of rollouts. The cookbook wires this through a shared GCS bucket:
  • The trainer writes a fresh checkpoint to the bucket after each optimizer step (or on a configurable cadence).
  • The deployment watches the same bucket and swaps in new weights without a pod restart.
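
The write-then-watch pattern in the two bullets above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: a local temporary directory stands in for the GCS bucket, the `step-NNNNNN.json` file naming is invented for the example, and `write_checkpoint`, `latest_step`, and `ToyDeployment` are hypothetical names, not part of the Fireworks SDK or the cookbook.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def write_checkpoint(bucket: Path, step: int, weights: dict) -> Path:
    """Trainer side: publish the weights for one optimizer step."""
    final = bucket / f"step-{step:06d}.json"
    tmp = bucket / f".step-{step:06d}.json.tmp"
    tmp.write_text(json.dumps(weights))
    # Rename into place so a watcher never reads a half-written file.
    tmp.replace(final)
    return final


def latest_step(bucket: Path) -> Optional[int]:
    """Return the highest published step number, or None if the bucket is empty."""
    steps = [int(p.stem.split("-")[1]) for p in bucket.glob("step-*.json")]
    return max(steps, default=None)


class ToyDeployment:
    """Deployment side: poll the bucket and swap weights in place."""

    def __init__(self, bucket: Path) -> None:
        self.bucket = bucket
        self.loaded_step: Optional[int] = None
        self.weights: Optional[dict] = None

    def poll(self) -> bool:
        """Load the newest checkpoint if it differs from the one in memory."""
        step = latest_step(self.bucket)
        if step is None or step == self.loaded_step:
            return False
        path = self.bucket / f"step-{step:06d}.json"
        self.weights = json.loads(path.read_text())
        self.loaded_step = step  # swapped without recreating the object
        return True


def run_demo() -> list:
    """Alternate trainer writes and deployment polls; return the loaded steps."""
    with tempfile.TemporaryDirectory() as d:
        bucket = Path(d)
        deployment = ToyDeployment(bucket)
        loaded = []
        for step in range(1, 4):
            write_checkpoint(bucket, step, {"w": step})
            if deployment.poll():
                loaded.append(deployment.loaded_step)
        return loaded


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the deployment object stays alive across weight swaps, which mirrors the "no pod restart" property described above. How the real deployment detects and loads new checkpoints is not specified on this page.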
Terminology. The internal Fireworks name for this mechanism is hotload. You’ll see that name in SDK field names (hot_load_trainer_job, hot_load_deployment_id, hot_load_bucket_url), low-level helper names (WeightSyncer.save_and_hotload), and server error messages. “Weight sync” and “hotload” refer to the same thing.

Normal flow

The RL recipe provisions the trainer and deployment for you — set deployment=DeployConfig(...) on the recipe Config and the SDK-managed service client wires the bucket correctly. With the default DeployConfig(weight_sync_scope=WeightSyncScope.PER_TRAINER), the trainer is requested first and the deployment is linked to the trainer-owned bucket. WeightSyncScope.PER_DEPLOYMENT reverses that order: the deployment is created first, then trainers write to the deployment-owned bucket. If you misconfigure the pairing, the server rejects the CreateDeployment or CreateRlorTrainerJob call up front with an error that links back here.

WeightSyncScope: who owns the bucket

DeployConfig.weight_sync_scope controls which resource must be created first:
| Scope | Bucket owner | Use when |
| --- | --- | --- |
| PER_TRAINER (default) | Trainer: one bucket per run | Single run, or one trainer feeding multiple deployments (sampler + held-out eval) |
| PER_DEPLOYMENT | Deployment: stable bucket across trainer runs | Long-lived deployment, many sequential trainers, can’t tolerate deployment restarts between runs |
The recipe dispatches on this single field and wires the rest correctly. The two scopes are mutually exclusive for the same trainer ↔ deployment pair — don’t mix them.
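
As a rough illustration of how a single scope field can determine both the creation order and the bucket owner, here is a minimal sketch. The `WeightSyncScope` and `DeployConfig` classes below are local stand-ins that only mirror the names used on this page; they are not imported from the SDK, and `ProvisioningPlan` and `plan_provisioning` are hypothetical helpers invented for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class WeightSyncScope(Enum):
    # Local stand-in mirroring the two scope names on this page.
    PER_TRAINER = "per_trainer"
    PER_DEPLOYMENT = "per_deployment"


@dataclass(frozen=True)
class DeployConfig:
    # Local stand-in; only the field discussed here is modeled.
    weight_sync_scope: WeightSyncScope = WeightSyncScope.PER_TRAINER


@dataclass(frozen=True)
class ProvisioningPlan:
    # Hypothetical summary of what a scope implies.
    create_order: Tuple[str, str]
    bucket_owner: str


def plan_provisioning(config: DeployConfig) -> ProvisioningPlan:
    """Map the scope to the resource created first and the bucket owner."""
    if config.weight_sync_scope is WeightSyncScope.PER_TRAINER:
        # Trainer is requested first; the deployment links to its bucket.
        return ProvisioningPlan(("trainer", "deployment"), "trainer")
    # Deployment is created first; trainers write to its bucket.
    return ProvisioningPlan(("deployment", "trainer"), "deployment")


if __name__ == "__main__":
    print(plan_provisioning(DeployConfig()))
    print(plan_provisioning(DeployConfig(WeightSyncScope.PER_DEPLOYMENT)))
```

Because the plan is derived from one enum value, a given trainer and deployment pair cannot end up with both orderings at once, which is the mutual exclusivity described above.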

Diagnosing errors

The control plane catches scope-mix mistakes at create time and returns an error that names both resources and suggests the fix. For the full list of server error strings and per-error recovery steps, see the cookbook’s dev skill: skills/dev/references/rl/hotload.md. It also covers trainer retention, the unified promote API, and runtime bucket-mismatch warnings.
