What this is

Managed jobs let the platform handle the training lifecycle — scheduling, execution, checkpointing, and output model materialization. Use them when your objective fits a supported method and your priority is reliability and operational simplicity.

Managed job types

| Resource | Objective | API |
| --- | --- | --- |
| Supervised Fine-Tuning (SFT) | Cross-entropy on instruction/response pairs | fw.supervised_fine_tuning_jobs.* |
| DPO | Direct preference optimization on chosen/rejected pairs | fw.dpo_jobs.* |
| Managed RFT | Reinforcement fine-tuning with built-in RL losses | fw.reinforcement_fine_tuning_jobs.* |
| Service-mode RLOR | Custom objectives via Training SDK or Cookbook loops | TrainerJobManager + TrainerJobConfig (SDK) or cookbook recipes |

Listing jobs

```python
from fireworks import Fireworks

fw = Fireworks(api_key="<FIREWORKS_API_KEY>", account_id="<ACCOUNT_ID>")

# List jobs of each managed type
sft_jobs = fw.supervised_fine_tuning_jobs.list()
rft_jobs = fw.reinforcement_fine_tuning_jobs.list()
dpo_jobs = fw.dpo_jobs.list()

for job in sft_jobs:
    print(f"SFT: {job.display_name} ({job.state})")
```

Creating managed jobs

SFT (flat keyword arguments)

```python
job = fw.supervised_fine_tuning_jobs.create(
    dataset="accounts/<ACCOUNT_ID>/datasets/my-dataset",
    base_model="accounts/fireworks/models/qwen3-8b",
    learning_rate=2e-5,
    epochs=3,
    max_context_length=4096,
    lora_rank=16,
    wandb_config={"entity": "my-team", "project": "sft"},
)
```

DPO (flat keyword arguments)

```python
job = fw.dpo_jobs.create(
    dataset="accounts/<ACCOUNT_ID>/datasets/preference-data",
    base_model="accounts/fireworks/models/qwen3-8b",
    learning_rate=1e-5,
    max_context_length=4096,
)
```
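DPO trains on a preference dataset of chosen/rejected pairs. As a rough sketch of what one JSONL record might look like (the field names here are illustrative assumptions, not the confirmed schema; consult the dataset format reference):

```python
import json

# Hypothetical record shape for a preference dataset: one JSON object per
# line, each pairing a prompt with a preferred and a dispreferred response.
record = {
    "prompt": [{"role": "user", "content": "Explain gradient descent briefly."}],
    "chosen": [{"role": "assistant", "content": "Gradient descent iteratively steps against the loss gradient."}],
    "rejected": [{"role": "assistant", "content": "Gradient descent is a database indexing technique."}],
}
line = json.dumps(record)  # append this line to preference-data.jsonl
```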

Managed RFT (with training_config and loss_config)

```python
job = fw.reinforcement_fine_tuning_jobs.create(
    dataset="accounts/<ACCOUNT_ID>/datasets/rl-data",
    training_config={
        "base_model": "accounts/fireworks/models/qwen3-8b",
        "max_context_length": 4096,
        "learning_rate": 1e-5,
    },
    loss_config={
        "method": "GRPO",
        "kl_beta": 0.01,
    },
)
```
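For intuition on what the GRPO objective optimizes: each prompt gets a group of sampled completions, and every completion's reward is converted into an advantage relative to that group by normalizing with the group mean and standard deviation. A minimal illustrative sketch of that normalization (background math only, not the platform's implementation):

```python
def group_relative_advantages(rewards, eps=1e-8):
    """Normalize a group's rewards to zero mean and unit scale.

    Completions scored above the group average get positive advantages
    (reinforced); those below get negative advantages (discouraged).
    """
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]
```

The kl_beta parameter in loss_config then weights a KL penalty that keeps the updated policy close to the base model while these advantages are applied.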

Available loss_config methods

| Method | Description |
| --- | --- |
| GRPO | Group Relative Policy Optimization (default for RFT) |
| DAPO | Dynamic Advantage Policy Optimization |
| DPO | Direct Preference Optimization (default for the DPO API) |
| ORPO | Odds Ratio Preference Optimization |
| GSPO_TOKEN | Token-level GSPO |
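When building loss_config dicts programmatically, it can help to fail fast on a typo in the method name rather than at job submission. A small sketch, assuming methods are passed as the plain strings listed above (the helper name is illustrative, not part of the SDK):

```python
# Valid values for loss_config["method"], per the table above.
SUPPORTED_METHODS = {"GRPO", "DAPO", "DPO", "ORPO", "GSPO_TOKEN"}

def make_loss_config(method="GRPO", **kwargs):
    """Build a loss_config dict, rejecting unsupported method names early."""
    if method not in SUPPORTED_METHODS:
        raise ValueError(f"Unsupported loss method: {method!r}")
    return {"method": method, **kwargs}
```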

W&B integration

All managed job types support wandb_config for native Weights & Biases logging:
```python
wandb_config = {
    "entity": "my-team",
    "project": "training-experiment",
}

# Works with SFT, DPO, and RFT jobs
fw.supervised_fine_tuning_jobs.create(
    ...,
    wandb_config=wandb_config,
)
```

When to switch to service-mode loops

Move from managed jobs to service-mode RLOR loops when you need:
  • Custom loss functions (e.g. hybrid GRPO + DPO, custom reward shaping)
  • Full-parameter tuning with per-step metrics
  • Inference-in-the-loop evaluation via hotloading during training
  • Algorithm research beyond the built-in methods
For custom service-mode loops, prefer fireworks.training.sdk.TrainerJobManager + TrainerJobConfig (see Training SDK Overview).

Operational guidance

  • Use managed jobs when your objective fits supported methods and you want minimal code.
  • Monitor job state by polling fw.<job_type>.get(...) until the job reaches a terminal state.
  • Cancel stuck jobs with fw.<job_type>.cancel(...) to release resources.
  • Delete completed jobs when you no longer need them.
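The polling and cancellation advice above can be sketched as one small helper. The terminal state names below are assumptions (verify them against the job state enum in the API reference), and get(...)/cancel(...) are assumed to accept the job name:

```python
import time

# Assumed terminal state names; check the actual job state enum.
TERMINAL_STATES = {"JOB_STATE_COMPLETED", "JOB_STATE_FAILED", "JOB_STATE_CANCELLED"}

def wait_for_job(jobs_api, job_name, poll_interval=30.0, timeout=3600.0):
    """Poll jobs_api.get(job_name) until the job reaches a terminal state.

    jobs_api can be any of fw.supervised_fine_tuning_jobs, fw.dpo_jobs, or
    fw.reinforcement_fine_tuning_jobs, assuming they share a get/cancel surface.
    On timeout, the job is cancelled so it does not hold resources.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = jobs_api.get(job_name)
        if job.state in TERMINAL_STATES:
            return job
        if time.monotonic() >= deadline:
            jobs_api.cancel(job_name)
            raise TimeoutError(f"{job_name} did not finish within {timeout}s")
        time.sleep(poll_interval)
```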