Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
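As a minimal sketch of the header format described above, the helper below builds the `Authorization` header from a token. The function name `bearer_header` is illustrative and not part of the API.

```python
def bearer_header(token: str) -> dict:
    """Return an Authorization header of the form 'Bearer <token>'."""
    token = token.strip()
    if not token:
        raise ValueError("auth token must not be empty")
    return {"Authorization": f"Bearer {token}"}


# Example with a placeholder token; substitute your own auth token.
headers = bearer_header("YOUR_AUTH_TOKEN")
```

The resulting dictionary can be passed as the headers argument of whichever HTTP client you use.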
Path Parameters
The Account Id
Query Parameters
ID of the reinforcement fine-tuning job. If not specified, a random UUID is generated.
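The sketch below shows one way the account ID (path parameter) and the optional job ID (query parameter) might be combined into a request URL. The host, the path layout, and the query parameter name `jobId` are all hypothetical placeholders, since this section does not state them; check the endpoint definition for the real values.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def build_job_url(base_url: str, account_id: str, job_id: Optional[str] = None) -> str:
    """Build a request URL from the account ID and an optional job ID.

    The path segments and the 'jobId' parameter name are assumptions
    made for illustration only.
    """
    path = f"{base_url.rstrip('/')}/accounts/{quote(account_id, safe='')}/jobs"
    if job_id is None:
        # Omit the query parameter so the server generates a random UUID.
        return path
    return f"{path}?{urlencode({'jobId': job_id})}"
```

Leaving `job_id` unset keeps the query string empty, which matches the documented behavior of letting the service assign the ID.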
Body
The name of the dataset used for training.
The resource name of the evaluator to use for the RLOR fine-tuning job.
The name of a separate dataset to use for evaluation.
Whether to auto-carve the dataset for eval.
JobState represents the state an asynchronous job can be in.
JOB_STATE_UNSPECIFIED, JOB_STATE_CREATING, JOB_STATE_RUNNING, JOB_STATE_COMPLETED, JOB_STATE_FAILED, JOB_STATE_CANCELLED, JOB_STATE_DELETING, JOB_STATE_WRITING_RESULTS, JOB_STATE_VALIDATING, JOB_STATE_DELETING_CLEANING_UP, JOB_STATE_PENDING, JOB_STATE_EXPIRED, JOB_STATE_RE_QUEUEING, JOB_STATE_CREATING_INPUT_DATASET, JOB_STATE_IDLE
Common training configurations.
The Weights & Biases team/user account for logging training progress.
The output dataset's aggregated stats for the evaluation job.
BIJ parameters.
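To illustrate how the body fields above might be assembled, here is a minimal sketch of a request payload builder. The attribute names (`dataset`, `evaluator`, `evaluation_dataset`, `auto_carve`, `wandb_entity`) are hypothetical, because this section gives only field descriptions and not the JSON keys; map them to the real keys before use.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class JobRequest:
    # All attribute names are illustrative, not the API's actual JSON keys.
    dataset: str                               # dataset used for training
    evaluator: str                             # evaluator resource name
    evaluation_dataset: Optional[str] = None   # separate dataset for evaluation
    auto_carve: Optional[bool] = None          # auto-carve the dataset for eval
    wandb_entity: Optional[str] = None         # Weights & Biases team/user account

    def to_payload(self) -> dict:
        """Serialize to a dict, leaving out fields that were never set."""
        return {k: v for k, v in asdict(self).items() if v is not None}


# Placeholder resource names for illustration.
payload = JobRequest(dataset="my-dataset", evaluator="my-evaluator").to_payload()
```

Dropping unset fields rather than sending nulls lets the service apply its own defaults; an explicit `False` for `auto_carve` is still sent because only `None` is filtered out.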
Response
A successful response.
The name of the dataset used for training.
The resource name of the evaluator to use for the RLOR fine-tuning job.
The completion time of the reinforcement fine-tuning job.
The name of a separate dataset to use for evaluation.
Whether to auto-carve the dataset for eval.
JobState represents the state an asynchronous job can be in.
JOB_STATE_UNSPECIFIED, JOB_STATE_CREATING, JOB_STATE_RUNNING, JOB_STATE_COMPLETED, JOB_STATE_FAILED, JOB_STATE_CANCELLED, JOB_STATE_DELETING, JOB_STATE_WRITING_RESULTS, JOB_STATE_VALIDATING, JOB_STATE_DELETING_CLEANING_UP, JOB_STATE_PENDING, JOB_STATE_EXPIRED, JOB_STATE_RE_QUEUEING, JOB_STATE_CREATING_INPUT_DATASET, JOB_STATE_IDLE
The email address of the user who initiated this fine-tuning job.
Common training configurations.
The Weights & Biases team/user account for logging training progress.
The output dataset's aggregated stats for the evaluation job.
BIJ parameters.
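Since the response reports the job's JobState, a client will typically want to parse that value and decide whether to keep waiting. The sketch below models the fifteen listed states as a Python enum. The set of states treated as terminal is an assumption inferred from the state names, not something this reference states, and the helper names are illustrative.

```python
from enum import Enum


class JobState(str, Enum):
    JOB_STATE_UNSPECIFIED = "JOB_STATE_UNSPECIFIED"
    JOB_STATE_CREATING = "JOB_STATE_CREATING"
    JOB_STATE_RUNNING = "JOB_STATE_RUNNING"
    JOB_STATE_COMPLETED = "JOB_STATE_COMPLETED"
    JOB_STATE_FAILED = "JOB_STATE_FAILED"
    JOB_STATE_CANCELLED = "JOB_STATE_CANCELLED"
    JOB_STATE_DELETING = "JOB_STATE_DELETING"
    JOB_STATE_WRITING_RESULTS = "JOB_STATE_WRITING_RESULTS"
    JOB_STATE_VALIDATING = "JOB_STATE_VALIDATING"
    JOB_STATE_DELETING_CLEANING_UP = "JOB_STATE_DELETING_CLEANING_UP"
    JOB_STATE_PENDING = "JOB_STATE_PENDING"
    JOB_STATE_EXPIRED = "JOB_STATE_EXPIRED"
    JOB_STATE_RE_QUEUEING = "JOB_STATE_RE_QUEUEING"
    JOB_STATE_CREATING_INPUT_DATASET = "JOB_STATE_CREATING_INPUT_DATASET"
    JOB_STATE_IDLE = "JOB_STATE_IDLE"


# Assumption: these states mean the job will make no further progress.
ASSUMED_TERMINAL_STATES = frozenset({
    JobState.JOB_STATE_COMPLETED,
    JobState.JOB_STATE_FAILED,
    JobState.JOB_STATE_CANCELLED,
    JobState.JOB_STATE_EXPIRED,
})


def parse_job_state(raw: str) -> JobState:
    """Map a raw state string to JobState, falling back to UNSPECIFIED."""
    try:
        return JobState(raw)
    except ValueError:
        return JobState.JOB_STATE_UNSPECIFIED


def is_terminal(state: JobState) -> bool:
    """Return True if the state is in the assumed terminal set."""
    return state in ASSUMED_TERMINAL_STATES
```

Falling back to `JOB_STATE_UNSPECIFIED` for unknown strings keeps a polling loop from crashing if the service later introduces a state this client does not know about.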