What this is

Service-mode RLOR jobs expose trainer endpoints consumed by custom training loops.

Workflow

  1. Create a job with serviceMode=true.
  2. Wait for the job to become ready, then capture its direct_route_handle.
  3. Resume or delete jobs as experiments evolve.
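Step 2 amounts to a polling loop. The sketch below is one way to write it, not part of the SDK: wait_until_ready, get_status, and is_ready are hypothetical names, and the readiness test is left to the caller because this page does not say which field or state signals readiness.

```python
import time
from typing import Any, Callable


def wait_until_ready(
    get_status: Callable[[], Any],
    is_ready: Callable[[Any], bool],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Any:
    """Poll get_status() until is_ready(status) is true or the timeout expires.

    Hypothetical helper: both callables are supplied by the caller, so no
    assumption is made here about the shape of the job status response.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if is_ready(status):
            return status  # caller reads direct_route_handle from this
        if clock() >= deadline:
            raise TimeoutError(f"job not ready after {timeout_s} seconds")
        sleep(interval_s)
```

In practice get_status could wrap the get call shown under "End-to-end examples", and is_ready could check that direct_route_handle is populated. Where that field sits on the response is an assumption to verify against a real job. The sleep and clock parameters exist only so the loop can be tested without waiting.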

End-to-end examples

Create and inspect RLOR job

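# Assumes `fw` is an already-configured Fireworks client.
# Create a service-mode trainer job; lora_rank=0 selects full-parameter tuning.
job = fw.reinforcement_fine_tuning_steps.create(
    training_config={"base_model": "accounts/fireworks/models/qwen3-8b", "lora_rank": 0},
    extra_body={"serviceMode": True, "keepAlive": False},
)
# The job ID is the last path segment of the resource name.
job_id = job.name.split("/")[-1]
# Fetch the job to inspect its current state.
status = fw.reinforcement_fine_tuning_steps.get(rlor_trainer_job_id=job_id)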

Operational guidance

  • Service-mode trainer jobs currently support full-parameter tuning only. Set lora_rank=0 when serviceMode=true (lora_rank>0 is rejected).
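The rule above can be checked on the client before the request is sent. The following is a minimal sketch under stated assumptions: check_service_mode_config is a hypothetical helper, not an SDK function, and it takes the same two dictionaries passed to create in the example above. Treating a missing lora_rank as an error is a choice made here, since this page does not state the default.

```python
from typing import Any, Mapping


def check_service_mode_config(
    training_config: Mapping[str, Any],
    extra_body: Mapping[str, Any],
) -> None:
    """Raise ValueError if a service-mode request is not full-parameter.

    Hypothetical pre-flight check; the service performs its own validation.
    """
    if not extra_body.get("serviceMode", False):
        return  # the restriction only applies to service-mode jobs
    lora_rank = training_config.get("lora_rank")
    if lora_rank != 0:
        # Covers both lora_rank > 0 and a missing value (assumption: the
        # default is not documented here, so require it to be explicit).
        raise ValueError(
            f"serviceMode=true requires lora_rank=0, got {lora_rank!r}"
        )
```

Calling it with the dictionaries from the create example returns without error, while lora_rank=8 with serviceMode set to True raises ValueError locally instead of waiting for the request to be rejected.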