Resume RLOR Trainer Job
Authorizations
Bearer authentication using your Fireworks API key. Format: Bearer <API_KEY>
Path Parameters
The Account ID
The RLOR Trainer Job ID
Body
The body is of type object.
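As a rough illustration of how the authorization header, the two path parameters, and the object body fit together, the sketch below builds (but does not send) a resume request. The base URL, the `rlorTrainerJobs/...:resume` path layout, the POST method, and the helper name `build_resume_request` are assumptions made for illustration; none of them is stated on this page, so check them against the endpoint definition before use.

```python
import json
import urllib.request

# Assumption: base URL and path layout are illustrative, not taken from this page.
BASE_URL = "https://api.fireworks.ai/v1"


def build_resume_request(api_key: str, account_id: str, job_id: str) -> urllib.request.Request:
    """Build a resume request for an RLOR trainer job without sending it."""
    # Assumption: the job is addressed by account ID and job ID, with a ":resume" suffix.
    url = f"{BASE_URL}/accounts/{account_id}/rlorTrainerJobs/{job_id}:resume"
    return urllib.request.Request(
        url,
        data=json.dumps({}).encode("utf-8"),  # the body is a JSON object; empty here
        headers={
            "Authorization": f"Bearer {api_key}",  # format: Bearer <API_KEY>
            "Content-Type": "application/json",
        },
        method="POST",  # assumption: the HTTP method is not stated on this page
    )


req = build_resume_request("YOUR_API_KEY", "my-account", "my-job")
```

Passing `req` to `urllib.request.urlopen` would send the request; the placeholder key and IDs above would need to be replaced with real values first.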
Response
A successful response.
The name of the dataset used for training.
The name of a separate dataset to use for evaluation.
Whether to automatically carve out part of the training dataset for evaluation.
JobState represents the state an asynchronous job can be in.
- JOB_STATE_PAUSED: Job is paused, typically due to account suspension or manual intervention.
- JOB_STATE_DELETED: Job has been deleted.
JOB_STATE_UNSPECIFIED, JOB_STATE_CREATING, JOB_STATE_RUNNING, JOB_STATE_COMPLETED, JOB_STATE_FAILED, JOB_STATE_CANCELLED, JOB_STATE_DELETING, JOB_STATE_WRITING_RESULTS, JOB_STATE_VALIDATING, JOB_STATE_DELETING_CLEANING_UP, JOB_STATE_PENDING, JOB_STATE_EXPIRED, JOB_STATE_RE_QUEUEING, JOB_STATE_CREATING_INPUT_DATASET, JOB_STATE_IDLE, JOB_STATE_CANCELLING, JOB_STATE_EARLY_STOPPED, JOB_STATE_PAUSED, JOB_STATE_DELETED
The email address of the user who initiated this fine-tuning job.
Common training configurations.
A list of reward metrics to use for training, in the format "<reward_name>=".
The Weights & Biases team/user account for logging training progress.
The AWS configuration for S3 dataset access.
The Azure configuration for Azure Blob Storage dataset access.
Job progress.
Rollout deployment name associated with this RLOR trainer job. This is optional. If not set, the trainer does not trigger weight sync to the rollout engine.
Reinforcement learning loss method and hyperparameters for the underlying trainer.
The number of nodes to use for the fine-tuning job. If not specified, the default is 1.
Accelerator seconds used by the job, keyed by accelerator type (e.g., "NVIDIA_H100_80GB"). Updated periodically.
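Because usage is reported in seconds per accelerator type, a small conversion can make the figures easier to read. The sketch below assumes the field arrives as a JSON object mapping accelerator type to a count of seconds, possibly encoded as a string; the helper name and the sample value are hypothetical.

```python
from typing import Dict, Union


def accelerator_hours(accelerator_seconds: Dict[str, Union[int, str]]) -> Dict[str, float]:
    """Convert per-accelerator seconds into hours, keeping the same keys."""
    # Assumption: values may be ints or numeric strings, so normalize with int().
    return {kind: int(seconds) / 3600 for kind, seconds in accelerator_seconds.items()}


# Hypothetical sample value, not real job data.
usage = accelerator_hours({"NVIDIA_H100_80GB": "7200"})
```

Since the field is updated periodically, a value read while the job is running is a snapshot rather than a final total.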
The deployment ID used for hot loading. When set, checkpoints are saved to this deployment's hot load bucket, enabling weight swaps on inference. Only valid for service-mode or keep-alive jobs.
Scheduling purpose for this job.
PURPOSE_UNSPECIFIED, PURPOSE_PILOT
When true, run the trainer in forward-only mode (no backward/optimizer). Used for reference models in GRPO that only need forward passes.
For managed service use only. Users do not need to set this field.
Trainer inactivity timeout. The trainer reports tracked activity, including trainer API operations and active-session heartbeats. If no tracked activity is observed for this duration, the trainer is automatically stopped. When unset or 0, defaults to 60 minutes. Set disableInactivityCleanup to true to disable automatic cleanup. GPU usage continues to accrue while the trainer is running.
Disable trainer inactivity cleanup. When true, the trainer is not automatically stopped due to inactivity. GPU usage continues to accrue while the trainer is running.
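The interaction between the inactivity timeout and `disableInactivityCleanup` can be summarized as a small decision function. This is only a sketch of the rules described above, not the service's implementation: the function name is hypothetical, and expressing the timeout as a number of minutes is an assumption, since this page does not state the field's wire format.

```python
from typing import Optional

DEFAULT_TIMEOUT_MINUTES = 60  # documented default when the timeout is unset or 0


def effective_inactivity_timeout(
    timeout_minutes: Optional[float],
    disable_inactivity_cleanup: bool = False,
) -> Optional[float]:
    """Return the timeout in minutes that applies, or None if cleanup is disabled."""
    if disable_inactivity_cleanup:
        # Cleanup disabled: the trainer is never stopped for inactivity.
        return None
    if not timeout_minutes:
        # Unset (None) or 0 falls back to the default.
        return DEFAULT_TIMEOUT_MINUTES
    return timeout_minutes
```

For example, `effective_inactivity_timeout(None)` and `effective_inactivity_timeout(0)` both fall back to the 60-minute default, while passing `disable_inactivity_cleanup=True` returns `None` regardless of the timeout. In every case GPU usage continues to accrue while the trainer is running.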