Start with the linear workflow in RL Rollouts with Your Own Trainer if you have not yet completed a first full snapshot and rollout.
Incremental snapshots use ARC2 delta compression (compression_format: "arc_v2") with Adler32 checksums (checksum_format: "alder32").
Snapshot cadence
| When | Snapshot type | Notes |
|---|---|---|
| First training step | Full | HuggingFace layout under a new identity |
| Every 20th–30th step | Full | Resets the chain; faster recovery if a delta is corrupt |
| All other steps | Incremental | previous_snapshot_identity must match the snapshot currently served |
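The cadence in the table can be expressed as a small helper. This is a minimal sketch: the function name snapshot_type, the 1-based step counter, and the default interval of 25 (one value inside the 20–30 range above) are illustrative assumptions, not part of the Fireworks API.

```python
def snapshot_type(step: int, full_every: int = 25) -> str:
    """Return "full" or "incremental" for a 1-based training step.

    Assumption: steps are counted from 1, and a full snapshot is taken
    on the first step and then every `full_every` steps after it.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    if full_every < 1:
        raise ValueError("full_every must be >= 1")
    # Step 1 is always full; steps 1 + k * full_every reset the chain.
    if (step - 1) % full_every == 0:
        return "full"
    return "incremental"
```

A shorter interval shortens the chain you have to replay after a corrupt delta, at the cost of more full uploads.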
Why incremental?
- Smaller uploads: Typical compression ratios exceed 20× versus re-uploading full weights.
- Faster loads: Less data over the network; merge applies on replicas that already hold the previous snapshot.
- Chain dependency: Each incremental snapshot must reference the correct previous_snapshot_identity (the last successfully loaded snapshot).
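One way to respect the chain dependency on the trainer side is to advance a pointer only after a load is confirmed. The class below is a hypothetical helper (SnapshotChain and its method names are not part of any Fireworks library) that sketches the bookkeeping.

```python
from typing import Optional


class SnapshotChain:
    """Tracks which identity the next incremental snapshot must reference."""

    def __init__(self) -> None:
        self.last_loaded: Optional[str] = None

    def previous_for_incremental(self) -> str:
        # An incremental snapshot is only valid on top of a loaded snapshot.
        if self.last_loaded is None:
            raise RuntimeError("no snapshot loaded yet; upload a full snapshot first")
        return self.last_loaded

    def mark_loaded(self, identity: str) -> None:
        # Call this only after the deployment confirms the load succeeded.
        self.last_loaded = identity
```

If a load fails, do not call mark_loaded; the next incremental snapshot then still references the last snapshot that actually loaded.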
Create ARC2 deltas
You need a pair of consecutive full checkpoints on disk (or tensors in memory) to produce diff safetensors for the new step.
Compression library
Use the Fireworks delta compression utilities. A reference implementation is available in this GitHub gist (delta_compress_files_to_file, arc_v2, alder32).
Per-file example (previous full snapshot version_001, new full snapshot version_002_full, upload diff as version_002):
Copy the non-weight files (config.json, tokenizer) from the new full tree into version_002/ as needed.
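The per-file loop described above might look like the following sketch. The signature of delta_compress_files_to_file is not shown on this page, so the compression step is passed in as a callable compress(prev_file, new_file, out_file); that callable, the function name build_incremental_dir, and the assumption that shard file names match between the two full trees are all hypothetical.

```python
import shutil
from pathlib import Path
from typing import Callable


def build_incremental_dir(
    prev_dir: Path,
    new_dir: Path,
    out_dir: Path,
    compress: Callable[[Path, Path, Path], None],
) -> list[str]:
    """Write one diff per .safetensors file and copy the other files as-is.

    `compress(prev_file, new_file, out_file)` is a hypothetical wrapper
    around the delta compression utility; its real signature may differ.
    Returns the names of the files written to `out_dir`.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for new_file in sorted(new_dir.iterdir()):
        if not new_file.is_file():
            continue
        out_file = out_dir / new_file.name
        if new_file.suffix == ".safetensors":
            # Assumption: the previous tree uses the same shard file names.
            prev_file = prev_dir / new_file.name
            if not prev_file.exists():
                raise FileNotFoundError(f"no previous shard for {new_file.name}")
            compress(prev_file, new_file, out_file)
        else:
            # config.json, tokenizer files, and similar are copied unchanged.
            shutil.copy2(new_file, out_file)
        written.append(new_file.name)
    return written
```

For the example above, prev_dir would be version_001, new_dir would be version_002_full, and out_dir would be version_002.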
If the previous checkpoint is already in trainer CPU memory, the gist also exposes tensor-level helpers (delta_compress_dicts, etc.) so you can avoid writing full intermediates to disk.
Upload only the diff files under the new identity (for example s3://.../version_002/). Do not re-upload the entire full checkpoint every step.
Upload workflow
- Build diffs with arc_v2 for each .safetensors file.
- Upload all files under the new identity prefix (same bucket parent as snapshot layout).
- Optionally call per-file hints as each file completes.
- Signal incremental ready via POST /hot_load.
- Poll GET /hot_load until all replicas are ready (same criteria as the integration guide).
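The last two steps can be sketched with two small helpers. The field names (identity, previous_snapshot_identity, compression_format, checksum_format, readiness, current_snapshot_identity) come from this page. The flat layout of the request body and the shape of the per-replica status (a list of dicts) are assumptions; check them against the hot-load API reference before relying on them.

```python
def build_incremental_payload(identity: str, previous_identity: str) -> dict:
    """Body for the incremental-ready signal.

    Assumption: the fields sit at the top level of the JSON body.
    The real API may nest the incremental metadata differently.
    """
    if identity == previous_identity:
        raise ValueError("a new snapshot needs a new identity")
    return {
        "identity": identity,
        "previous_snapshot_identity": previous_identity,
        "compression_format": "arc_v2",
        "checksum_format": "alder32",  # spelled exactly as on this page
    }


def all_replicas_ready(replicas: list[dict], identity: str) -> bool:
    """True when every replica reports readiness and serves `identity`.

    Assumption: each replica status is a dict with `readiness` and
    `current_snapshot_identity` keys; the real response may differ.
    """
    return bool(replicas) and all(
        r.get("readiness") is True
        and r.get("current_snapshot_identity") == identity
        for r in replicas
    )
```

A polling loop would call all_replicas_ready on each GET /hot_load response and sleep between attempts until it returns True or a timeout expires.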
Per-file hints (optional)
Hints let Fireworks start fetching and staging files before you signal the full snapshot. They are optional but recommended for large models.
Endpoint: POST https://api.fireworks.ai/hot_load/v1/models/hot_load/hint
Headers: Same as hot-load API (Authorization, fireworks-model, fireworks-deployment).
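As a rough illustration of the endpoint and headers above, the sketch below builds a hint request without sending it. The hint body schema is not reproduced here, so the payload is passed through unchanged by the caller; the function name build_hint_request, the Bearer authorization scheme, and the JSON content type are assumptions.

```python
import json
import urllib.request

HINT_URL = "https://api.fireworks.ai/hot_load/v1/models/hot_load/hint"


def build_hint_request(
    api_key: str, model: str, deployment: str, payload: dict
) -> urllib.request.Request:
    """Build (but do not send) a POST request to the hint endpoint.

    `payload` is whatever hint body the API expects; this sketch does
    not define its fields. The Bearer scheme is an assumption.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "fireworks-model": model,
        "fireworks-deployment": deployment,
        "Content-Type": "application/json",
    }
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(HINT_URL, data=data, headers=headers, method="POST")
```

To send the request, pass the returned object to urllib.request.urlopen and check the response status.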
Full snapshot hint:
Signal incremental snapshot ready
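After all files are uploaded, signal the deployment to load the incremental snapshot. The request uses these fields:
- previous_snapshot_identity: The identity of the snapshot already loaded on the deployment (must exist in the ledger).
- compression_format: Use "arc_v2" for BYOT integrations.
- checksum_format: Use "alder32".
- Prompt cache option: all (default), none, or new_session. See the prompt cache matrix.
Then poll GET /hot_load until every replica reports readiness: true and current_snapshot_identity == "version_002".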
Reference
- Every snapshot needs a new identity (single directory name, no /).
- Point previous_snapshot_identity at the snapshot the deployment is serving before this load.
- Upload incremental diff safetensors under the new identity; keep periodic full snapshots for recovery.
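The identity rules above are easy to check before uploading anything. A minimal sketch, assuming the trainer keeps its own set of identities it has already used (validate_identity and the used set are hypothetical, not part of the Fireworks API):

```python
def validate_identity(identity: str, used: set[str]) -> str:
    """Check the identity rules and return the identity unchanged.

    `used` is a caller-maintained set of identities already uploaded;
    how you persist it is up to your trainer.
    """
    if not identity:
        raise ValueError("identity must not be empty")
    if "/" in identity:
        raise ValueError("identity must be a single directory name without '/'")
    if identity in used:
        raise ValueError(f"identity {identity!r} was already used")
    return identity
```

Calling it once per snapshot, before building the upload prefix, catches a reused or path-like identity early instead of at load time.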
Related pages
Quickstart (BYOT)
Prerequisites, deployment setup, first full snapshot, and rollouts.
Ledger & debugging
Inspect snapshot history and recover from a broken chain.
Inference for RL rollouts
Session affinity, policy version, and MoE Router Replay.