What this is

Managed SFT jobs are the shortest path for supervised adaptation when you don’t need a custom per-step objective loop. The platform handles the training lifecycle — you just provide data and configuration.
For SFT with a custom training loop, use training.recipes.sft_loop from the cookbook.

When to use managed SFT vs. custom loops

Use managed SFT when…                            Use custom loops when…
Standard cross-entropy objective is sufficient   You need custom loss functions (GRPO, DPO variants)
You want minimal code                            You want per-step control and metrics
No need for inference-in-the-loop evaluation     You need hotload + sampling during training

Workflow

  1. Create a dataset and upload your training data to the Fireworks platform.
  2. Launch a supervised fine-tuning job with your training configuration.
  3. Monitor the job until it completes.
  4. Deploy the resulting model.

Step 1: Create and upload dataset

from fireworks import Fireworks

fw = Fireworks(api_key="<FIREWORKS_API_KEY>", account_id="<ACCOUNT_ID>")

# Create dataset resource
dataset = fw.datasets.create(
    dataset_id="my-sft-dataset",
    dataset={"exampleCount": "12000"},
)

# Upload training data (JSONL format)
fw.datasets.upload(
    dataset_id="my-sft-dataset",
    file="/path/to/sft_data.jsonl",
)

# Validate the upload
fw.datasets.validate_upload(
    dataset_id="my-sft-dataset",
    body={},
)

Dataset format

Each line in the JSONL file should contain a conversation in the standard messages format:
{"messages": [{"role": "user", "content": "What is 2+2?"}, {"role": "assistant", "content": "4"}]}
{"messages": [{"role": "user", "content": "Translate hello to French"}, {"role": "assistant", "content": "Bonjour"}]}
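Catching format problems locally is cheaper than a failed upload. Below is a minimal sketch of a client-side check, assuming only the messages schema shown above; validate_sft_jsonl is an illustrative helper, not part of the SDK, and the platform-side validate_upload call remains the authoritative check:

```python
import json

def validate_sft_jsonl(path):
    """Check that each line parses as JSON and follows the messages format."""
    with open(path) as f:
        for i, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)  # raises ValueError on malformed JSON
            messages = record.get("messages")
            assert isinstance(messages, list) and messages, f"line {i}: missing 'messages'"
            for msg in messages:
                assert msg.get("role") in ("system", "user", "assistant"), f"line {i}: bad role"
                assert isinstance(msg.get("content"), str), f"line {i}: content must be a string"
    return True
```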

Step 2: Launch SFT job

The SFT create API uses flat keyword arguments (not a nested training_config dict):
job = fw.supervised_fine_tuning_jobs.create(
    dataset="accounts/<ACCOUNT_ID>/datasets/my-sft-dataset",
    base_model="accounts/fireworks/models/qwen3-8b",
    max_context_length=4096,
    learning_rate=2e-5,
    epochs=3,
    lora_rank=16,
    gradient_accumulation_steps=4,
    display_name="my-sft-experiment",
)

print(job.name)   # accounts/<ACCOUNT_ID>/supervisedFineTuningJobs/<JOB_ID>
print(job.state)  # JOB_STATE_CREATING

SFT create parameters

Parameter                    Type   Description
dataset                      str    Required. Dataset resource name
base_model                   str    Base model to fine-tune
learning_rate                float  Learning rate
max_context_length           int    Maximum sequence length
epochs                       int    Number of training epochs
lora_rank                    int    LoRA rank (omit for full fine-tuning)
batch_size                   int    Max packed tokens per batch
batch_size_samples           int    Number of samples per gradient batch
gradient_accumulation_steps  int    Gradient accumulation steps
learning_rate_warmup_steps   int    Linear warmup steps
optimizer_weight_decay       float  L2 regularization
early_stop                   bool   Stop early if validation loss plateaus
eval_auto_carveout           bool   Auto-split data for evaluation
evaluation_dataset           str    Separate eval dataset resource name
output_model                 str    Model ID for the output (defaults to job ID)
display_name                 str    Human-readable job name
nodes                        int    Number of training nodes
jinja_template               str    Custom prompt template
wandb_config                 dict   W&B logging config
warm_start_from              str    Resume from a PEFT addon model
region                       str    Training region
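batch_size_samples and gradient_accumulation_steps together determine how many samples contribute to each optimizer step. A tiny illustration under the usual convention (an assumption here, not a documented platform guarantee); the helper name and the example value of 8 samples per gradient batch are hypothetical:

```python
def effective_batch_size(batch_size_samples, gradient_accumulation_steps, nodes=1):
    # Common convention (assumed, not platform-verified): each optimizer step
    # sees samples-per-gradient-batch x accumulation-steps x training nodes.
    return batch_size_samples * gradient_accumulation_steps * nodes

# With batch_size_samples=8 and the gradient_accumulation_steps=4 used above:
print(effective_batch_size(8, 4))  # 32 samples per optimizer step
```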

Step 3: Monitor the job

import time

job_id = job.name.split("/")[-1]
while True:
    status = fw.supervised_fine_tuning_jobs.get(supervised_fine_tuning_job_id=job_id)
    state = str(status.state)
    print(f"Job state: {state}")
    if state in ("JOB_STATE_COMPLETED", "JOB_STATE_FAILED", "JOB_STATE_CANCELLED"):
        break
    time.sleep(15)

if state != "JOB_STATE_COMPLETED":
    raise RuntimeError(f"SFT job did not complete: state={state}")
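The loop above polls indefinitely, which can hang automation if a job stalls. A sketch of a bounded wrapper; wait_for_job and fetch_state are illustrative names, not SDK API, and fetch_state is any zero-argument callable returning the current state string:

```python
import time

TERMINAL_STATES = {"JOB_STATE_COMPLETED", "JOB_STATE_FAILED", "JOB_STATE_CANCELLED"}

def wait_for_job(fetch_state, poll_interval=15, timeout=6 * 3600):
    """Poll fetch_state() until a terminal state is reached or timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_interval)
    raise TimeoutError(f"job not terminal after {timeout}s")
```

Usage would pass a closure over the SDK call, e.g. `lambda: str(fw.supervised_fine_tuning_jobs.get(supervised_fine_tuning_job_id=job_id).state)`.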

Step 4: Deploy the resulting model

fw.deployments.create(
    deployment_id="sft-serving",
    base_model=f"accounts/<ACCOUNT_ID>/models/{status.output_model or job_id}",
    min_replica_count=0,
    max_replica_count=1,
)
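Once the deployment is up, the model can be queried through Fireworks' OpenAI-compatible chat completions endpoint. A minimal standard-library sketch; build_chat_request is an illustrative helper, and the exact model identifier to pass (including whether a deployment suffix is required) depends on your account and deployment:

```python
import json
import urllib.request

def build_chat_request(model, prompt, api_key):
    # Constructs (but does not send) a request to the OpenAI-compatible endpoint.
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return urllib.request.Request(
        "https://api.fireworks.ai/inference/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# req = build_chat_request("accounts/<ACCOUNT_ID>/models/<JOB_ID>", "Hello!", "<FIREWORKS_API_KEY>")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```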

Operational guidance

  • SFT managed jobs optimize a supervised cross-entropy objective — no custom loss code required.
  • Use a held-out evaluation set and evaluate before promoting a trained model to production.
  • LoRA is supported for SFT managed jobs — use lora_rank=16 or 32 for parameter-efficient tuning.
  • If you need custom objective functions, move to service-mode Training SDK loops instead (see Custom Train Step). Service mode supports both full-parameter and LoRA tuning.
  • W&B integration: Pass wandb_config={"entity": "my-team", "project": "sft-exp"} to enable logging.