Early Access Feature. This page is part of the private-preview external-bucket hot-load workflow for RL rollouts. Contact Fireworks to enable this path on your account before using non-FW_HOSTED storage.
Start with the linear workflow in RL Rollouts with Your Own Trainer if you have not completed a first full snapshot and rollout yet.
Use incremental snapshots between full snapshots to reduce upload size and weight-update time. Each incremental snapshot is a compressed delta against a previous snapshot identity already loaded on the deployment. Fireworks supports the public ARC2 format (compression_format: "arc_v2") with Adler32 checksums (checksum_format: "alder32").

Snapshot cadence

| When | Snapshot type | Notes |
| --- | --- | --- |
| First training step | Full | HuggingFace layout under a new identity |
| Every 20th–30th step | Full | Resets the chain; faster recovery if a delta is corrupt |
| All other steps | Incremental | previous_snapshot_identity must match the snapshot currently served |
If an incremental hot-load fails or the chain is wedged, publish a new full snapshot and see Ledger & debugging.
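The cadence in the table can be expressed as a small decision function. This is a minimal sketch, not part of the Fireworks API: the name snapshot_type, the full_every default of 25 (picked from the 20 to 30 range above), and the chain_broken flag are all illustrative assumptions.

```python
def snapshot_type(step: int, full_every: int = 25, chain_broken: bool = False) -> str:
    """Return "full" or "incremental" for a 1-indexed training step.

    Hypothetical helper: `full_every` and `chain_broken` are illustrative
    names, not Fireworks parameters.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    if full_every < 1:
        raise ValueError("full_every must be >= 1")
    # A failed or wedged chain is recovered by publishing a new full snapshot.
    if chain_broken:
        return "full"
    # The first step is always full; every `full_every`-th step resets the chain.
    if step == 1 or step % full_every == 0:
        return "full"
    return "incremental"
```

A trainer loop could call snapshot_type(step) once per step and set chain_broken=True after a failed hot-load so that the next publish is a full snapshot.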

Why incremental?

  • Smaller uploads — Typical compression ratios exceed 20× versus re-uploading full weights.
  • Faster loads — Less data over the network; merge applies on replicas that already hold the previous snapshot.
  • Chain dependency — Each incremental snapshot must reference the correct previous_snapshot_identity (the last successfully loaded snapshot).

Create ARC2 deltas

You need a pair of consecutive full checkpoints on disk (or their tensors in memory). From that pair, produce diff safetensors files for the new step.

Compression library

Use the Fireworks delta compression utilities. A reference implementation is available in this GitHub gist (delta_compress_files_to_file, arc_v2, alder32).
Per-file example (previous full snapshot version_001, new full snapshot version_002_full, upload diff as version_002):
from delta import delta_compress_files_to_file  # from the gist / your vendored copy

delta_compress_files_to_file(
    src="version_001/model-00000.safetensors",
    dst="version_002_full/model-00000.safetensors",
    diff_file="version_002/model-00000.safetensors",
    compression_format="arc_v2",
)
Repeat for each safetensors shard (same filenames as the base layout). Copy non-weight files (for example config.json, tokenizer) from the new full tree into version_002/ as needed.
If the previous checkpoint is already in trainer CPU memory, the gist also exposes tensor-level helpers (delta_compress_dicts, etc.) so you can avoid writing full intermediates to disk.
Upload only the incremental directory for the new identity (for example s3://.../version_002/). Do not re-upload the entire full checkpoint every step.
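The per-shard loop described above might look like the following sketch. It assumes the compression function accepts the same keyword arguments as the per-file example (src, dst, diff_file, compression_format) and takes it as a parameter, so nothing here depends on how the gist is packaged. The name build_incremental_dir is hypothetical.

```python
import shutil
from pathlib import Path
from typing import Callable, List


def build_incremental_dir(prev_dir, new_full_dir, diff_dir,
                          compress: Callable[..., object]) -> List[str]:
    """Write one diff per .safetensors shard and copy every other file.

    `compress` is assumed to behave like delta_compress_files_to_file
    from the per-file example; pass that function in when calling.
    """
    prev_dir, new_full_dir, diff_dir = Path(prev_dir), Path(new_full_dir), Path(diff_dir)
    diff_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for new_file in sorted(new_full_dir.iterdir()):
        if not new_file.is_file():
            continue
        target = diff_dir / new_file.name
        if new_file.suffix == ".safetensors":
            base = prev_dir / new_file.name
            if not base.exists():
                raise FileNotFoundError(f"no base shard for {new_file.name}")
            # Same filenames as the base layout; the diff lands under the new identity.
            compress(src=str(base), dst=str(new_file),
                     diff_file=str(target), compression_format="arc_v2")
        else:
            # Non-weight files (config.json, tokenizer files) are copied unchanged.
            shutil.copy2(new_file, target)
        written.append(new_file.name)
    return written
```

For example, build_incremental_dir("version_001", "version_002_full", "version_002", delta_compress_files_to_file) would populate version_002/ with the files to upload.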

Upload workflow

  1. Build diffs with arc_v2 for each .safetensors file.
  2. Upload all files under the new identity prefix (same bucket parent as snapshot layout).
  3. Optionally call per-file hints as each file completes.
  4. Signal incremental ready via POST /hot_load.
  5. Poll GET /hot_load until all replicas are ready (same criteria as the integration guide).
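The ordering of steps 2 through 5 can be sketched as a small driver. All four callables are hypothetical placeholders supplied by the caller (your bucket upload, the hint request, the POST /hot_load call, and the readiness poll); only the sequencing is taken from the list above.

```python
from typing import Callable, Iterable, Optional


def publish_incremental(filenames: Iterable[str],
                        upload_file: Callable[[str], None],
                        signal_ready: Callable[[], None],
                        wait_ready: Callable[[], bool],
                        send_hint: Optional[Callable[[str], None]] = None) -> bool:
    """Upload each file, optionally hint it, then signal and poll.

    Hypothetical driver: the callables wrap your own storage client
    and HTTP requests; none of these names come from Fireworks.
    """
    for name in filenames:
        upload_file(name)          # step 2: file lands under the new identity prefix
        if send_hint is not None:
            send_hint(name)        # step 3: optional per-file hint once the upload completes
    signal_ready()                 # step 4: POST /hot_load
    return wait_ready()            # step 5: poll GET /hot_load until replicas are ready
```

Hinting only after a file has finished uploading keeps the deployment from fetching a partially written object.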

Per-file hints (optional)

Hints let Fireworks start fetching and staging files before you signal the full snapshot. They are optional but recommended for large models.
Endpoint: POST https://api.fireworks.ai/hot_load/v1/models/hot_load/hint
Headers: Same as the hot-load API (Authorization, fireworks-model, fireworks-deployment).
Full snapshot hint:
{
  "snapshot": { "identity": "version_001" },
  "filename": "model-00000.safetensors"
}
Incremental snapshot hint:
{
  "snapshot": {
    "identity": "version_002",
    "incremental_snapshot_metadata": {
      "previous_snapshot_identity": "version_001",
      "compression_format": "arc_v2",
      "checksum_format": "alder32"
    }
  },
  "filename": "model-00000.safetensors"
}
Example request (incremental snapshot hint):
curl -X POST https://api.fireworks.ai/hot_load/v1/models/hot_load/hint \
  -H "Authorization: Bearer <fireworks_api_key>" \
  -H "fireworks-model: accounts/<account_id>/models/<model_id>" \
  -H "fireworks-deployment: accounts/<account_id>/deployments/<deployment_id>" \
  -H "Content-Type: application/json" \
  -d '{
    "snapshot": {
      "identity": "version_002",
      "incremental_snapshot_metadata": {
        "previous_snapshot_identity": "version_001",
        "compression_format": "arc_v2",
        "checksum_format": "alder32"
      }
    },
    "filename": "model-00000.safetensors"
  }'
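If the trainer is written in Python, the same hint request can be assembled in code. The sketch below mirrors the two JSON bodies and the curl headers shown above. The function names are hypothetical, and send_hint returns only the HTTP status because the response body is not documented on this page.

```python
import json
import urllib.request
from typing import Optional

HINT_URL = "https://api.fireworks.ai/hot_load/v1/models/hot_load/hint"


def build_headers(api_key: str, account_id: str, model_id: str, deployment_id: str) -> dict:
    """Headers from the curl example above."""
    return {
        "Authorization": f"Bearer {api_key}",
        "fireworks-model": f"accounts/{account_id}/models/{model_id}",
        "fireworks-deployment": f"accounts/{account_id}/deployments/{deployment_id}",
        "Content-Type": "application/json",
    }


def build_hint_payload(identity: str, filename: str,
                       previous_identity: Optional[str] = None) -> dict:
    """Full-snapshot hint when previous_identity is None, incremental otherwise."""
    snapshot = {"identity": identity}
    if previous_identity is not None:
        snapshot["incremental_snapshot_metadata"] = {
            "previous_snapshot_identity": previous_identity,
            "compression_format": "arc_v2",
            "checksum_format": "alder32",  # spelled as in the examples on this page
        }
    return {"snapshot": snapshot, "filename": filename}


def send_hint(headers: dict, payload: dict, timeout_s: float = 30.0) -> int:
    """POST one hint and return the HTTP status code (response body not parsed)."""
    req = urllib.request.Request(HINT_URL, data=json.dumps(payload).encode("utf-8"),
                                 headers=headers, method="POST")
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return resp.status
```

Calling build_hint_payload("version_002", "model-00000.safetensors", previous_identity="version_001") reproduces the incremental body from the curl example.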

Signal incremental snapshot ready

After all files are uploaded, signal the deployment to load the incremental snapshot:
curl -X POST https://api.fireworks.ai/hot_load/v1/models/hot_load \
  -H "Authorization: Bearer <fireworks_api_key>" \
  -H "fireworks-model: accounts/<account_id>/models/<model_id>" \
  -H "fireworks-deployment: accounts/<account_id>/deployments/<deployment_id>" \
  -H "Content-Type: application/json" \
  -d '{
    "identity": "version_002",
    "incremental_snapshot_metadata": {
      "previous_snapshot_identity": "version_001",
      "compression_format": "arc_v2",
      "checksum_format": "alder32"
    },
    "reset_prompt_cache": "all"
  }'
Request body fields:

  • incremental_snapshot_metadata.previous_snapshot_identity (string, required): The identity of the snapshot already loaded on the deployment (must exist in the ledger).
  • incremental_snapshot_metadata.compression_format (string, required): Use "arc_v2" for BYOT integrations.
  • incremental_snapshot_metadata.checksum_format (string, required): Use "alder32".
  • reset_prompt_cache (string): all (default), none, or new_session. See the prompt cache matrix.

Poll until every replica has readiness: true and current_snapshot_identity == "version_002".
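The signal body and the polling condition can be sketched in Python as follows. The shape of the GET /hot_load response is not shown on this page, so the code assumes the caller supplies a fetch_replicas function that returns one dict per replica with readiness and current_snapshot_identity keys. That shape and all function names are assumptions.

```python
import time
from typing import Callable, Iterable, Mapping


def build_hot_load_payload(identity: str, previous_identity: str,
                           reset_prompt_cache: str = "all") -> dict:
    """Body for POST /hot_load, matching the curl example above."""
    if reset_prompt_cache not in {"all", "none", "new_session"}:
        raise ValueError(f"unsupported reset_prompt_cache: {reset_prompt_cache!r}")
    return {
        "identity": identity,
        "incremental_snapshot_metadata": {
            "previous_snapshot_identity": previous_identity,
            "compression_format": "arc_v2",
            "checksum_format": "alder32",
        },
        "reset_prompt_cache": reset_prompt_cache,
    }


def all_replicas_ready(replicas: Iterable[Mapping], identity: str) -> bool:
    """True only if there is at least one replica and all serve `identity`."""
    replicas = list(replicas)
    return bool(replicas) and all(
        r.get("readiness") is True and r.get("current_snapshot_identity") == identity
        for r in replicas
    )


def wait_until_loaded(fetch_replicas: Callable[[], Iterable[Mapping]], identity: str,
                      timeout_s: float = 600.0, interval_s: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until every replica is ready or the timeout expires.

    `fetch_replicas` is a caller-supplied wrapper around GET /hot_load
    (assumed response shape: a list of per-replica dicts).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if all_replicas_ready(fetch_replicas(), identity):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Treating an empty replica list as not ready avoids reporting success while the deployment has no replicas to report on. A False return is the cue to consult Ledger & debugging and publish a full snapshot.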

Reference

  • Every snapshot needs a new identity (single directory name, no /).
  • Point previous_snapshot_identity at the snapshot the deployment is serving before this load.
  • Upload incremental diff safetensors under the new identity; keep periodic full snapshots for recovery.
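The identity rules above are easy to check before uploading. This is a minimal sketch: the version_NNN naming follows the examples on this page but is only a convention, and both function names are hypothetical.

```python
def next_identity(step: int) -> str:
    """Illustrative naming scheme matching the version_001 style used in the examples."""
    return f"version_{step:03d}"


def check_new_identity(identity: str, used_identities: set) -> str:
    """Raise ValueError unless `identity` is a fresh, single directory name."""
    if not identity:
        raise ValueError("identity must not be empty")
    if "/" in identity:
        raise ValueError("identity must be a single directory name (no '/')")
    if identity in used_identities:
        raise ValueError(f"identity {identity!r} was already used; every snapshot needs a new one")
    return identity
```

Keeping the set of used identities in the trainer's own state is one way to avoid reusing a name after a failed load.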

Quickstart (BYOT)

Prerequisites, deployment setup, first full snapshot, and rollouts.

Ledger & debugging

Inspect snapshot history and recover from a broken chain.

Inference for RL rollouts

Session affinity, policy version, and MoE Router Replay.