If you are using Fireworks-managed RLOR trainers with FW_HOSTED, the ledger
and checkpoint-swap behavior here still matter, but you can usually ignore the
external-bucket setup and manual upload/signaling details from the BYOT
integration guide.
Inspect snapshot history
Dump the ledger, sorted by most recent snapshot first.
Each entry records the identity you signaled, whether it was a full or delta snapshot, the per-replica readiness transition timestamps, and any load error.
Inspect deployment status and failures
If the deployment itself is unhealthy (crashlooping after a bad snapshot, out-of-memory on merge, etc.), the cause is recorded on the deployment resource itself.
Read status, latestStatus.reason, and the most recent ledger entry together to determine whether the problem is load-side, weights-side, or infra-side.
Reset the ledger
If the delta chain is wedged or you want to force the deployment back to the base model, you can clear server-side ledger history. This preserves the deployment itself; it just forgets every hot-loaded snapshot.
Checkpoint-swap behavior
When you signal a new snapshot, Fireworks must eventually swap weights on every replica. What happens to in-flight and new requests during the swap depends on which transition mode the deployment is configured with.
Both modes behave the same way for checkpoint download: it always starts immediately after the signal, in parallel with ongoing inference. The modes differ in how they handle the actual weight-swap moment.
Which mode you get is preconfigured on the deployment template and is not yet surfaced through firectl. Ask your Fireworks contact if you need to change it.
Async transition (recommended, default for RL)
This mode is similar in spirit to PipelineRL:
- In-flight requests: paused for the duration of the swap, then resumed on the same HTTP connection. The active turn keeps its current KV state, so the request continues streaming instead of restarting.
- New requests: queued until the swap finishes. Clients observe this as elevated time-to-first-token (TTFT).
- No 4xx or 5xx is returned for the swap itself.
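Because the async mode surfaces a swap only as added latency, one way to notice it from the client side is to record time-to-first-token per request. The following is a minimal sketch, not part of any Fireworks SDK: `measure_ttft` is a hypothetical helper, and it assumes `chunks` is a lazy iterator whose first `next()` call is what actually sends the request (as with most streaming HTTP clients), so that the timer covers queueing time.

```python
import time
from typing import Callable, Iterable, List, Optional, Tuple


def measure_ttft(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[Optional[float], List[str]]:
    """Consume a streamed response and return (ttft_seconds, all_chunks).

    ttft_seconds is None if the stream ended without yielding anything.
    """
    start = clock()
    ttft: Optional[float] = None
    received: List[str] = []
    for chunk in chunks:
        if ttft is None:
            # First chunk arrived: this gap is what grows during a swap.
            ttft = clock() - start
        received.append(chunk)
    return ttft, received
```

Comparing these values against a baseline collected outside of swaps is one way to tell swap-induced queueing apart from a genuinely slow replica.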
Synchronous transition
- In-flight requests: the server waits for them to complete on the old weights before swapping.
- New requests arriving during the swap are rejected with HTTP 425 Too Early. Your rollout client should back off and retry, ideally using the same session-affinity key so it lands on a replica that has already finished the swap.
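The back-off-and-retry advice for synchronous mode can be sketched as follows. This is an illustration under stated assumptions rather than a Fireworks client: `call_with_swap_retry` and the `send` callable are hypothetical, `send` is assumed to issue one request tagged with the given session-affinity key and return `(status_code, body)`, and the delay values are arbitrary starting points.

```python
import random
import time
from typing import Callable, Tuple

TOO_EARLY = 425  # HTTP 425 Too Early: the replica is mid-swap


def call_with_swap_retry(
    send: Callable[[str], Tuple[int, str]],
    session_key: str,
    max_attempts: int = 6,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send` until it stops returning 425 or attempts run out.

    The same session_key is reused on every attempt, so retries keep the
    same affinity. Returns the last (status, body) seen.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send(session_key)
        if status != TOO_EARLY:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter, capped at max_delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))
    return status, body
```

Only 425 is retried here; other error codes are returned to the caller unchanged, since they are not caused by the swap itself.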
Prompt cache reset behavior
reset_prompt_cache only affects what can be reused after the swap. It does not interrupt the active turn described above.
- all (current default): after the swap, later requests refill prompt cache broadly.
- new_session: existing session IDs keep their current cache namespace, while new session IDs refill.
- none: preserve prompt-cache state across the swap.
Set reset_prompt_cache in the POST /hot_load/v1/models/hot_load request body, for example { "identity": "version_002", "reset_prompt_cache": "new_session" }.
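A small helper can catch a mistyped mode before the signal is sent. The sketch below only builds the JSON body; `build_hot_load_body` is a hypothetical name, the two fields are the ones shown in the example above, and a real request may need additional fields, a host, and authentication that are not covered here. It also assumes that omitting reset_prompt_cache leaves the server default in effect.

```python
import json
from typing import Optional

HOT_LOAD_PATH = "/hot_load/v1/models/hot_load"  # path quoted in the text above
RESET_MODES = ("all", "new_session", "none")    # the three documented values


def build_hot_load_body(
    identity: str,
    reset_prompt_cache: Optional[str] = None,
) -> str:
    """Return the JSON body for a hot-load signal.

    reset_prompt_cache is left out when None (assumed to mean "use the
    server default") and validated against RESET_MODES otherwise.
    """
    if not identity:
        raise ValueError("identity must be a non-empty string")
    payload = {"identity": identity}
    if reset_prompt_cache is not None:
        if reset_prompt_cache not in RESET_MODES:
            raise ValueError(
                f"reset_prompt_cache must be one of {RESET_MODES}, "
                f"got {reset_prompt_cache!r}"
            )
        payload["reset_prompt_cache"] = reset_prompt_cache
    return json.dumps(payload)
```

For instance, `build_hot_load_body("version_002", "new_session")` produces a body equivalent to the example in the text, ready to POST to `HOT_LOAD_PATH` with whichever HTTP client you already use.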
Need help?
If the ledger stops advancing, a snapshot never becomes ready, or the deployment stays unhealthy after you fall back to a full snapshot, contact Fireworks. Include the account ID, deployment ID, snapshot identity you tried to load, and the latest ledger output.
Related pages
Full integration guide
The end-to-end BYOT flow.
Inference for RL rollouts
Session affinity and MoE Router Replay used alongside the swap behavior above.