What this is
Sampler checkpoints are for deployment/hotload; train-state checkpoints are for reliable resume and continuation.

Workflow
- Save sampler checkpoints at stable intervals.
- Hotload the candidate checkpoint into the deployment.
- Persist optimizer state for resumable runs.
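
The workflow above can be sketched as a small scheduling helper. This is a minimal illustration, not the API of any particular training library: `CheckpointPlan`, `checkpoints_due`, and `schedule` are hypothetical names, and the intervals are assumed to be counted in optimizer steps. The actual save and hotload calls are left out because they depend on your training stack.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CheckpointPlan:
    """Hypothetical schedule; both intervals are assumed to be in optimizer steps."""
    sampler_every: int      # sampler checkpoint, used for deployment/hotload
    train_state_every: int  # train-state checkpoint (includes optimizer state), used for resume


def checkpoints_due(step: int, plan: CheckpointPlan) -> List[str]:
    """Return the checkpoint kinds that should be saved after `step`."""
    due: List[str] = []
    if step > 0 and step % plan.sampler_every == 0:
        due.append("sampler")
    if step > 0 and step % plan.train_state_every == 0:
        due.append("train_state")
    return due


def schedule(total_steps: int, plan: CheckpointPlan) -> List[Tuple[int, str]]:
    """List every (step, kind) save event for a run of `total_steps` steps."""
    events: List[Tuple[int, str]] = []
    for step in range(1, total_steps + 1):
        for kind in checkpoints_due(step, plan):
            events.append((step, kind))
    return events
```

In a training loop, one would call `checkpoints_due(step, plan)` after each optimizer step and dispatch to whatever save routine the framework provides for each kind. Keeping the two intervals separate lets sampler checkpoints be saved more often than the train-state checkpoints needed for resume.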