If you are using Fireworks-managed RLOR trainers with FW_HOSTED, the ledger and checkpoint-swap behavior here still matter, but you can usually ignore the external-bucket setup and manual upload/signaling details from the BYOT integration guide.
Inspect snapshot history
Dump the ledger, sorted by most recent snapshot first. Each entry records the identity you signaled, whether it was a full or delta snapshot, the per-replica readiness transition timestamps, and any load error.
Inspect deployment status and failures
If the deployment itself is unhealthy (crashlooping after a bad snapshot, out-of-memory on merge, etc.), the reason is on the deployment resource itself. Read status, latestStatus.reason, and the most recent ledger entry together to determine whether the problem is load-side, weights-side, or infra-side.
Snapshot config validation errors
Weight sync validates each snapshot’s config.json against the deployment’s base-model config before serving the snapshot. A validation failure means the snapshot stayed unloaded; continue serving the previous ready snapshot, or fix the files and fall back to a new full snapshot.
Common messages include:
- Extra base model config options or Extra snapshot model config options: one config has a top-level field that the other does not.
- Config value mismatch for <field>: both configs contain the field, but the values differ.
- Types mismatch: the snapshot config resolves to a different HuggingFace config class than the base model.
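
To make the first two message types concrete, here is a minimal Python sketch of a top-level comparison between two config dictionaries. It is an illustration under stated assumptions, not the weight-sync validator: the function name compare_configs and its ignore parameter are hypothetical, and the Types mismatch check (which depends on the resolved HuggingFace config class) is not modeled.

```python
from typing import Any, Dict, Iterable, List


def compare_configs(
    base: Dict[str, Any],
    snapshot: Dict[str, Any],
    ignore: Iterable[str] = (),
) -> List[str]:
    """Return mismatch messages for two top-level config dicts.

    Hypothetical helper: message wording mirrors the documented errors,
    but the real validator may compare more than top-level fields.
    """
    skipped = set(ignore)
    problems: List[str] = []

    # Fields that appear in only one of the two configs.
    base_only = sorted(set(base) - set(snapshot) - skipped)
    snapshot_only = sorted(set(snapshot) - set(base) - skipped)
    if base_only:
        problems.append(f"Extra base model config options: {base_only}")
    if snapshot_only:
        problems.append(f"Extra snapshot model config options: {snapshot_only}")

    # Fields that appear in both configs but carry different values.
    for field in sorted((set(base) & set(snapshot)) - skipped):
        if base[field] != snapshot[field]:
            problems.append(f"Config value mismatch for {field}")

    return problems


# Example configs (made-up values, for illustration only).
base = {"hidden_size": 4096, "num_hidden_layers": 32, "rope_theta": 10000.0}
snap = {"hidden_size": 4096, "num_hidden_layers": 30, "torch_dtype": "bfloat16"}

for message in compare_configs(base, snap, ignore={"torch_dtype"}):
    print(message)
```

Running a check like this locally before signaling a snapshot can catch a stray or changed field early, before the deployment rejects the snapshot.
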
To skip specific fields in the extra-field checks, list them in validation.extra_fields_ignore.
Reset the ledger
If the delta chain is wedged or you want to force the deployment back to the base model, you can clear server-side ledger history. This preserves the deployment itself; it just forgets every hot-loaded snapshot.
Checkpoint-swap behavior
When you signal a new snapshot, Fireworks has to eventually swap weights on every replica. What happens to in-flight and new requests during the swap depends on which transition mode the deployment is configured with.
Both modes behave the same way for checkpoint download: it always starts immediately after the signal, in parallel with ongoing inference. The modes differ in how they handle the actual weight-swap moment.
Set the mode at deployment create time with --hot-load-transition-type ASYNC or SYNC (default ASYNC). See Create a hot-load deployment.
Async transition (recommended, default for RL)
This mode is similar in spirit to PipelineRL:
- In-flight requests: paused for the duration of the swap, then resumed on the same HTTP connection. The active turn keeps its current KV state, so the request continues streaming instead of restarting.
- New requests: queued until the swap finishes. Clients observe this as elevated time-to-first-token (TTFT).
- No 4xx or 5xx is returned for the swap itself. Clients may set the x-fireworks-hot-load-drain-timeout request header, in seconds (default 90), to receive HTTP 425 Too Early once the timeout expires.
Synchronous transition
- In-flight requests: the server waits for them to complete on the old weights before swapping.
- New requests arriving during the swap are rejected with HTTP 425 Too Early. Your rollout client should back off and retry, ideally using the same session-affinity key so it lands on a replica that has already finished the swap.
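
The back-off-and-retry advice can be sketched as a small Python helper. This is a sketch under assumptions rather than an official client: send_with_retry and its send callable are hypothetical (the transport is injected so the logic stays independent of any HTTP library), and using x-multi-turn-session-id as the session-affinity key is an assumption; substitute whichever affinity header your rollout client already sends.

```python
import random
import time
from typing import Callable, Dict, Tuple

HTTP_TOO_EARLY = 425

# (status code, response body) as returned by the injected transport.
Response = Tuple[int, str]


def send_with_retry(
    send: Callable[[Dict[str, str]], Response],
    session_id: str,
    max_attempts: int = 5,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Retry a request on HTTP 425 with exponential backoff and jitter.

    `send` is a hypothetical callable that performs one HTTP request with
    the given headers and returns (status, body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    # Assumption: this header doubles as the session-affinity key. It is
    # kept identical across retries so they can land on the same replica.
    headers = {"x-multi-turn-session-id": session_id}

    for attempt in range(max_attempts):
        status, body = send(headers)
        if status != HTTP_TOO_EARLY:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff: base, 2x base, 4x base, ... plus jitter.
            delay = base_delay_s * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))

    # Every attempt returned 425; hand the last response to the caller.
    return status, body
```

Injecting sleep as a parameter keeps the helper easy to test without real delays; production code can leave the default time.sleep in place.
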
Prompt cache reset behavior
reset_prompt_cache only affects what can be reused after the swap. It does not interrupt the active turn (the in-flight HTTP stream), but it affects the next turn in the same session and new sessions.
Configure it per snapshot in the body of POST /hot_load/v1/models/hot_load, for example { "identity": "version_002", "reset_prompt_cache": "new_session" }.
| reset_prompt_cache | Existing turn (same HTTP stream) | New turn, same x-multi-turn-session-id | New session (new session id) |
|---|---|---|---|
| all (default) | Async: continues with prior KV on the stream. Sync: waits for turn to finish before swap. | Recompute KV | Recompute KV |
| new_session | Continues | Reuse KV for that session id | Recompute KV |
| none | Continues | Reuse KV | Reuse KV |
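
The table can also be read as a lookup from reset_prompt_cache value and request kind to KV-cache outcome. The Python sketch below transcribes the last two columns and builds a signal body in the shape shown in the example above. The names KV_AFTER_SWAP, signal_body, and kv_after_swap, and the request-kind labels, are hypothetical; only the identity and reset_prompt_cache fields and their three values come from this page.

```python
import json

# KV-cache outcome after a swap, transcribed from the table above.
# Outer key: reset_prompt_cache value. Inner key: kind of request.
KV_AFTER_SWAP = {
    "all": {"new_turn_same_session": "recompute", "new_session": "recompute"},
    "new_session": {"new_turn_same_session": "reuse", "new_session": "recompute"},
    "none": {"new_turn_same_session": "reuse", "new_session": "reuse"},
}


def signal_body(identity: str, reset_prompt_cache: str = "all") -> str:
    """Build a JSON body for the hot-load signal (sketch, not a full schema)."""
    if reset_prompt_cache not in KV_AFTER_SWAP:
        allowed = ", ".join(sorted(KV_AFTER_SWAP))
        raise ValueError(f"reset_prompt_cache must be one of: {allowed}")
    return json.dumps(
        {"identity": identity, "reset_prompt_cache": reset_prompt_cache}
    )


def kv_after_swap(reset_prompt_cache: str, request_kind: str) -> str:
    """Return 'reuse' or 'recompute' for a request that follows the swap."""
    return KV_AFTER_SWAP[reset_prompt_cache][request_kind]


body = signal_body("version_002", "new_session")
print(body)
print(kv_after_swap("new_session", "new_turn_same_session"))
```

A lookup like this is handy in rollout clients that want to predict TTFT after a swap: a recompute outcome means the next turn pays the full prefill cost again.
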
Need help?
If the ledger stops advancing, a snapshot never becomes ready, or the deployment stays unhealthy after you fall back to a full snapshot, contact Fireworks. Include the account ID, deployment ID, snapshot identity you tried to load, and the latest ledger output.
Related pages
- Quickstart (BYOT): Prerequisites, deployment setup, and the hot-load API.
- Incremental snapshots: ARC2 deltas, hints, and incremental signal bodies.
- Inference for RL rollouts: Session affinity, policy version in streams, and MoE Router Replay.