What this is
Managed SFT jobs are the shortest path for supervised adaptation when you don't need a custom per-step objective loop. The platform handles the training lifecycle; you provide only data and configuration. For SFT with a custom training loop, use training.recipes.sft_loop from the cookbook.

When to use managed SFT vs. custom loops
| Use managed SFT when… | Use custom loops when… |
|---|---|
| Standard cross-entropy objective is sufficient | You need custom loss functions (GRPO, DPO variants) |
| You want minimal code | You want per-step control and metrics |
| No need for inference-in-the-loop evaluation | You need hotload + sampling during training |
Workflow
- Create and upload dataset to the Fireworks platform.
- Launch a supervised fine-tuning job with your training configuration.
- Monitor the job until it completes.
- Deploy the resulting model.
Step 1: Create and upload dataset
Dataset format
Each line in the JSONL file should contain a conversation in the standard messages format.

Step 2: Launch SFT job
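Before launching, it can help to sanity-check the Step 1 dataset locally. A minimal sketch, assuming rows use the chat messages schema of role/content dicts (the file name and conversation contents are illustrative, not from the platform docs):

```python
import json

# Illustrative rows in the chat "messages" format (assumed schema: each line
# is a JSON object with a "messages" list of {"role", "content"} dicts).
rows = [
    {"messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is SFT?"},
        {"role": "assistant", "content": "Supervised fine-tuning on labeled conversations."},
    ]},
]

# Write one JSON object per line (JSONL).
with open("train.jsonl", "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")

# Re-read and validate each line before upload.
with open("train.jsonl") as f:
    for line in f:
        record = json.loads(line)
        assert "messages" in record
        assert all({"role", "content"} <= set(m) for m in record["messages"])
```

A check like this catches malformed lines before the upload step rather than at job start.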
The SFT create API uses flat keyword arguments (not a nested training_config dict):
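A hedged sketch of assembling those flat keyword arguments. The client object and create-method name in the comment are assumptions, not verified SDK identifiers; the argument names mirror the parameter table below, and the resource names are illustrative:

```python
# Flat keyword arguments for an SFT create call; values are illustrative.
sft_kwargs = dict(
    dataset="accounts/my-account/datasets/my-sft-data",  # illustrative resource name
    base_model="accounts/fireworks/models/llama-v3p1-8b-instruct",  # illustrative
    learning_rate=1e-4,
    epochs=1,
    lora_rank=16,               # omit for full fine-tuning
    output_model="my-sft-model",
    display_name="sft-demo",
)

# job = client.supervised_fine_tuning_jobs.create(**sft_kwargs)  # assumed method name

# Note there is no nested config dict: everything is a top-level keyword.
assert "training_config" not in sft_kwargs
```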
SFT create parameters
| Parameter | Type | Description |
|---|---|---|
| dataset | str | Required. Dataset resource name |
| base_model | str | Base model to fine-tune |
| learning_rate | float | Learning rate |
| max_context_length | int | Maximum sequence length |
| epochs | int | Number of training epochs |
| lora_rank | int | LoRA rank (omit for full fine-tuning) |
| batch_size | int | Max packed tokens per batch |
| batch_size_samples | int | Number of samples per gradient batch |
| gradient_accumulation_steps | int | Gradient accumulation steps |
| learning_rate_warmup_steps | int | Linear warmup steps |
| optimizer_weight_decay | float | L2 regularization |
| early_stop | bool | Stop early if validation loss plateaus |
| eval_auto_carveout | bool | Auto-split data for evaluation |
| evaluation_dataset | str | Separate eval dataset resource name |
| output_model | str | Model ID for the output (defaults to job ID) |
| display_name | str | Human-readable job name |
| nodes | int | Number of training nodes |
| jinja_template | str | Custom prompt template |
| wandb_config | dict | W&B logging config |
| warm_start_from | str | Resume from a PEFT addon model |
| region | str | Training region |
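One way to reason about the batching parameters above, under the assumption (not platform-verified) that batch_size_samples applies per node and per accumulation step:

```python
# Hedged arithmetic sketch: effective samples contributing to one optimizer
# step, assuming batch_size_samples is a per-node, per-micro-batch count.
batch_size_samples = 8
gradient_accumulation_steps = 4
nodes = 1

effective_samples_per_step = batch_size_samples * gradient_accumulation_steps * nodes
print(effective_samples_per_step)  # 32
```

If that interpretation holds, raising gradient_accumulation_steps is a way to grow the effective batch without more memory per step.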
Step 3: Monitor the job
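A generic polling loop for this step. The job-status fetcher is injected so the sketch stays runnable without credentials, and the terminal state names are assumptions rather than confirmed platform values:

```python
import time

# Terminal state names are assumptions for illustration, not confirmed
# platform values.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED"}

def wait_for_job(fetch_state, poll_seconds=30, sleep=time.sleep):
    """Poll fetch_state() until it returns a terminal state, then return it."""
    while True:
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        sleep(poll_seconds)

# Usage with a stubbed fetcher that reaches a terminal state on the third poll:
states = iter(["PENDING", "RUNNING", "COMPLETED"])
final = wait_for_job(lambda: next(states), sleep=lambda _: None)
print(final)  # COMPLETED
```

In real use, fetch_state would wrap whatever job-status call the SDK exposes.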
Step 4: Deploy the resulting model
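A hedged sketch of this step. Only the resource-name construction is executed; the deploy call is commented out because the exact SDK method is an assumption, and the accounts/{account}/models/{model} path is an assumed resource-name convention:

```python
# Assumed Fireworks-style resource path for the fine-tuned model produced
# by the SFT job (output_model from the create call).
def model_resource_name(account_id: str, model_id: str) -> str:
    return f"accounts/{account_id}/models/{model_id}"

name = model_resource_name("my-account", "my-sft-model")
print(name)  # accounts/my-account/models/my-sft-model

# deployment = client.deployments.create(model=name)  # assumed method name
```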
Operational guidance
- SFT managed jobs optimize a supervised cross-entropy objective — no custom loss code required.
- Use a held-out evaluation set and evaluate before promoting a trained model to production.
- LoRA is supported for SFT managed jobs: use lora_rank=16 or 32 for parameter-efficient tuning.
- If you need custom objective functions, move to service-mode Training SDK loops instead (see Custom Train Step). Service mode supports both full-parameter and LoRA tuning.
- W&B integration: pass wandb_config={"entity": "my-team", "project": "sft-exp"} to enable logging.