What this is

Research teams move faster when they can iterate on objective functions in plain Python and validate each checkpoint in production-like serving conditions.
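
For concreteness, here is a minimal sketch of an objective written in plain Python, assuming a PyTorch-style training loop that already yields logits, target token ids, and a per-sample scalar reward; the function name, tensor shapes, and reward weighting are illustrative assumptions, not a specific vendor's API.

    import torch
    import torch.nn.functional as F

    def reward_weighted_nll(logits: torch.Tensor,
                            target_ids: torch.Tensor,
                            rewards: torch.Tensor) -> torch.Tensor:
        """Illustrative objective: negative log-likelihood weighted by a
        per-sample scalar reward.

        logits:     (batch, seq_len, vocab)
        target_ids: (batch, seq_len)
        rewards:    (batch,) e.g. from a programmatic reward function
        """
        # Per-token cross entropy, left unreduced so it can be reweighted.
        per_token = F.cross_entropy(
            logits.transpose(1, 2),  # (batch, vocab, seq_len), as F.cross_entropy expects
            target_ids,
            reduction="none",
        )                            # -> (batch, seq_len)
        per_sample = per_token.mean(dim=1)     # -> (batch,)
        return (rewards * per_sample).mean()   # scalar loss for backprop

Because the objective is ordinary Python, swapping in a different weighting or regularizer is a local code change rather than a feature request.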

Why this approach

  • Full-parameter updates maximize headroom for difficult reasoning and alignment tasks.
  • Custom losses remove the wait for vendor-specific algorithm implementations.
  • Serving-integrated evaluation avoids divergence between offline metrics and user-facing behavior.

Workflow

  1. Define objective and reward logic in your loop.
  2. Run short controlled experiments with frequent checkpoints.
  3. Hotload checkpoints into serving and evaluate with production-style prompts.
  4. Promote only checkpoints that pass both offline and serving evaluations (see the sketch after this list).
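
The loop below is a hedged sketch of steps 1 through 4 strung together; every helper (train_steps, save_checkpoint, run_offline_eval, hotload_into_serving, run_serving_eval, promote) is a placeholder stub for your own training, serving, and evaluation infrastructure, and the interval and threshold values are illustrative.

    from dataclasses import dataclass

    @dataclass
    class Checkpoint:
        step: int
        path: str

    # Placeholder stubs: replace with your own training, serving, and eval code.
    def train_steps(n: int) -> None:
        pass                                            # run n optimizer steps with your objective

    def save_checkpoint(step: int) -> Checkpoint:
        return Checkpoint(step, f"ckpts/step_{step}")   # persist train state

    def run_offline_eval(ckpt: Checkpoint) -> float:
        return 0.0                                      # held-out metric suite

    def hotload_into_serving(ckpt: Checkpoint) -> str:
        return f"rev-{ckpt.step}"                       # returns a serving revision id

    def run_serving_eval(revision: str) -> float:
        return 0.0                                      # production-style prompts against the revision

    def promote(revision: str) -> None:
        print(f"promoting {revision}")

    CHECKPOINT_EVERY = 200   # illustrative: keep runs short, checkpoint often
    PASS_THRESHOLD = 0.85    # illustrative promotion gate

    def run_experiment(total_steps: int) -> None:
        for step in range(CHECKPOINT_EVERY, total_steps + 1, CHECKPOINT_EVERY):
            train_steps(CHECKPOINT_EVERY)               # steps 1-2: short run with your objective
            ckpt = save_checkpoint(step)

            offline = run_offline_eval(ckpt)            # offline gate
            revision = hotload_into_serving(ckpt)       # step 3: serving-integrated evaluation
            serving = run_serving_eval(revision)

            # Step 4: promote only checkpoints that pass both gates.
            if offline >= PASS_THRESHOLD and serving >= PASS_THRESHOLD:
                promote(revision)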

Operational guidance

  • Treat train-state checkpoints, sampler checkpoints, and deployment revisions as a single experiment bundle (see the sketch after this list).
  • Run small regression suites on every hotload candidate.
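
One way to make the bundle concrete is a small manifest that versions all three artifacts and the regression result together; the dataclass fields, paths, and run_regression_suite stub below are illustrative assumptions rather than a specific tool's schema.

    import json
    from dataclasses import dataclass, asdict
    from pathlib import Path

    @dataclass
    class ExperimentBundle:
        experiment_id: str
        step: int
        train_state_path: str          # optimizer + weights for resuming training
        sampler_checkpoint_path: str   # weights exported for the sampler / server
        deployment_revision: str       # serving revision created by the hotload
        regression_passed: bool = False

    def run_regression_suite(revision: str) -> bool:
        """Placeholder: run a small suite of golden prompts against the revision."""
        return False

    def save_bundle(bundle: ExperimentBundle, root: str = "experiments") -> Path:
        """Write one manifest so the three artifacts are promoted or rolled back together."""
        out = Path(root) / bundle.experiment_id / f"step_{bundle.step}.json"
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text(json.dumps(asdict(bundle), indent=2))
        return out

    # Example: record a hotload candidate and its regression result.
    bundle = ExperimentBundle(
        experiment_id="rl-objective-v3",   # illustrative names and paths
        step=1200,
        train_state_path="ckpts/rl-objective-v3/step_1200/train_state",
        sampler_checkpoint_path="ckpts/rl-objective-v3/step_1200/sampler",
        deployment_revision="rev-7f3a",
    )
    bundle.regression_passed = run_regression_suite(bundle.deployment_revision)
    save_bundle(bundle)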