What is the Cookbook?
The Fireworks Cookbook is a collection of training recipes and utilities built on top of the Training SDK. It provides config-driven training loops that handle trainer provisioning, data loading, tokenization, gradient accumulation, checkpointing, and cleanup automatically. The cookbook is optional — everything it does can be done with the SDK directly. Use the cookbook when you want a working training loop quickly; use the SDK when you need full control.
Installation
Available recipes
| Recipe | Module | Use case |
|---|---|---|
| GRPO / RL | training.recipes.rl_loop | On-policy and off-policy reinforcement learning with GRPO, importance sampling, DAPO, DRO, GSPO, and CISPO |
| DPO | training.recipes.dpo_loop | Direct preference optimization from chosen/rejected pairs |
| SFT | training.recipes.sft_loop | Supervised fine-tuning with cross-entropy loss |
| ORPO | training.recipes.orpo_loop | Odds ratio preference optimization |
Every recipe follows the same pattern: import its Config and main, set your config, and call main(cfg).
All launch examples below use infra=InfraConfig(training_shape_id=...). For cookbook users, that training shape ID is usually the only shape-specific input you need to set.
If you want field-level details about what a training shape controls and what stays configurable, see the SDK reference pages linked from Training Shapes.
Quick example: SFT
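A minimal sketch of the Config-and-main pattern for the SFT recipe. The sft_loop module path comes from the recipe table above; the InfraConfig import path, the model and dataset values, and the field names (learning_rate, epochs) are illustrative assumptions — check the Cookbook Reference for the actual parameters.

```python
# Hypothetical sketch — field names and import paths are illustrative,
# not verified against the SDK. See the Cookbook Reference for details.
from training.recipes.sft_loop import Config, main
from training.infra import InfraConfig  # assumed import path

cfg = Config(
    model="accounts/fireworks/models/llama-v3p1-8b-instruct",  # placeholder model
    dataset="s3://my-bucket/sft-data.jsonl",                   # placeholder dataset
    infra=InfraConfig(training_shape_id="..."),                # your training shape ID
    learning_rate=1e-5,                                        # assumed field name
    epochs=1,                                                  # assumed field name
)
main(cfg)  # provisions the trainer, runs the loop, cleans up
```

The recipe handles provisioning, tokenization, gradient accumulation, and checkpointing behind this single call.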
Quick example: GRPO
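A minimal sketch for the GRPO recipe, following the same Config-and-main pattern. The rl_loop module path comes from the recipe table; the reward-function hook and all field names are illustrative assumptions rather than verified API — the full GRPO walkthrough linked under Next steps covers the real reward-function interface.

```python
# Hypothetical sketch — the reward_fn signature and Config fields are
# illustrative, not verified against the SDK.
from training.recipes.rl_loop import Config, main
from training.infra import InfraConfig  # assumed import path

def reward_fn(prompt: str, completion: str) -> float:
    # Toy reward for illustration: prefer short completions.
    return 1.0 if len(completion) < 200 else 0.0

cfg = Config(
    model="accounts/fireworks/models/llama-v3p1-8b-instruct",  # placeholder model
    dataset="s3://my-bucket/prompts.jsonl",                    # placeholder prompt set
    reward_fn=reward_fn,                                       # assumed hook name
    infra=InfraConfig(training_shape_id="..."),                # your training shape ID
)
main(cfg)
```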
W&B logging
All cookbook recipes accept a WandBConfig to stream metrics to Weights & Biases:
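A sketch of attaching W&B logging to a recipe config. The WandBConfig import path and its fields (project, name) are illustrative assumptions; the Cookbook Reference lists the real class and parameters.

```python
# Hypothetical sketch — WandBConfig fields and import paths are
# illustrative, not verified against the SDK.
from training.recipes.sft_loop import Config, main
from training.logging import WandBConfig  # assumed import path
from training.infra import InfraConfig    # assumed import path

cfg = Config(
    model="accounts/fireworks/models/llama-v3p1-8b-instruct",  # placeholder model
    dataset="s3://my-bucket/sft-data.jsonl",                   # placeholder dataset
    infra=InfraConfig(training_shape_id="..."),
    wandb=WandBConfig(project="my-project", name="sft-run-1"), # assumed fields
)
main(cfg)  # training metrics stream to the configured W&B project
```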
Next steps
- Cookbook RL (GRPO) — full GRPO walkthrough with reward functions
- Cookbook DPO — preference optimization with pairwise data
- Cookbook SFT — supervised fine-tuning
- Cookbook Reference — all config classes and parameters