What is the Cookbook?
The Fireworks Cookbook is a collection of training recipes and utilities built on top of the Training SDK. It provides config-driven training loops that handle trainer provisioning, data loading, tokenization, gradient accumulation, checkpointing, and cleanup automatically. The cookbook is optional — everything it does can be done with the SDK directly. Use the cookbook when you want a working training loop quickly; use the SDK when you need full control.
Installation
Available recipes
| Recipe | Module | Use case |
|---|---|---|
| GRPO / RL | training.recipes.rl_loop | On-policy and off-policy reinforcement learning with GRPO, importance sampling, DAPO, DRO, GSPO, and CISPO |
| DPO | training.recipes.dpo_loop | Direct preference optimization from chosen/rejected pairs |
| SFT | training.recipes.sft_loop | Supervised fine-tuning with cross-entropy loss |
| ORPO | training.recipes.orpo_loop | Odds ratio preference optimization |
Every recipe follows the same pattern: import its Config and main, set your config, and call main(cfg).
All launch examples below use infra=InfraConfig(training_shape_id=...). For cookbook users, that training shape ID is usually the only shape-specific input you need to set.
If you want field-level details about what a training shape controls and what stays configurable, see the SDK reference pages linked from Training Shapes.
Quick example: SFT
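A minimal sketch of the Config-and-main pattern for the SFT recipe. The sft_loop module path comes from the recipe table above; the InfraConfig import path, the model and dataset values, and the field names (learning_rate, epochs) are illustrative assumptions — check the Cookbook Reference for the actual parameters.

```python
# Hypothetical sketch — field names and import paths are illustrative,
# not verified against the SDK. See the Cookbook Reference for details.
from training.recipes.sft_loop import Config, main
from training.infra import InfraConfig  # assumed import path

cfg = Config(
    model="accounts/fireworks/models/llama-v3p1-8b-instruct",  # placeholder model
    dataset="s3://my-bucket/sft-data.jsonl",                   # placeholder dataset
    infra=InfraConfig(training_shape_id="..."),                # your training shape ID
    learning_rate=1e-5,                                        # assumed field name
    epochs=1,                                                  # assumed field name
)
main(cfg)  # provisions the trainer, runs the loop, cleans up
```

The recipe handles provisioning, tokenization, gradient accumulation, and checkpointing behind this single call.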
Quick example: GRPO
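A minimal sketch for the GRPO recipe, following the same Config-and-main pattern. The rl_loop module path comes from the recipe table; the reward-function hook and all field names are illustrative assumptions rather than verified API — the full GRPO walkthrough linked under Next steps covers the real reward-function interface.

```python
# Hypothetical sketch — the reward_fn signature and Config fields are
# illustrative, not verified against the SDK.
from training.recipes.rl_loop import Config, main
from training.infra import InfraConfig  # assumed import path

def reward_fn(prompt: str, completion: str) -> float:
    # Toy reward for illustration: prefer short completions.
    return 1.0 if len(completion) < 200 else 0.0

cfg = Config(
    model="accounts/fireworks/models/llama-v3p1-8b-instruct",  # placeholder model
    dataset="s3://my-bucket/prompts.jsonl",                    # placeholder prompt set
    reward_fn=reward_fn,                                       # assumed hook name
    infra=InfraConfig(training_shape_id="..."),                # your training shape ID
)
main(cfg)
```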
W&B logging
All cookbook recipes accept a WandBConfig to stream metrics to Weights & Biases:
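A sketch of attaching W&B logging to a recipe config. The WandBConfig import path and its fields (project, name) are illustrative assumptions; the Cookbook Reference lists the real class and parameters.

```python
# Hypothetical sketch — WandBConfig fields and import paths are
# illustrative, not verified against the SDK.
from training.recipes.sft_loop import Config, main
from training.logging import WandBConfig  # assumed import path
from training.infra import InfraConfig    # assumed import path

cfg = Config(
    model="accounts/fireworks/models/llama-v3p1-8b-instruct",  # placeholder model
    dataset="s3://my-bucket/sft-data.jsonl",                   # placeholder dataset
    infra=InfraConfig(training_shape_id="..."),
    wandb=WandBConfig(project="my-project", name="sft-run-1"), # assumed fields
)
main(cfg)  # training metrics stream to the configured W&B project
```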
Next steps
- Cookbook RL (GRPO) — full GRPO walkthrough with reward functions
- Cookbook DPO — preference optimization with pairwise data
- Cookbook SFT — supervised fine-tuning
- Cookbook Reference — all config classes and parameters