POST /v1/accounts/{account_id}/rlorTrainerJobs
Create Reinforcement Fine-tuning Step
curl --request POST \
  --url https://api.fireworks.ai/v1/accounts/{account_id}/rlorTrainerJobs \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "displayName": "<string>",
  "dataset": "<string>",
  "evaluationDataset": "<string>",
  "evalAutoCarveout": true,
  "trainingConfig": {
    "outputModel": "<string>",
    "baseModel": "<string>",
    "warmStartFrom": "<string>",
    "jinjaTemplate": "<string>",
    "learningRate": 123,
    "maxContextLength": 123,
    "loraRank": 123,
    "region": "REGION_UNSPECIFIED",
    "epochs": 123,
    "batchSize": 123,
    "gradientAccumulationSteps": 123,
    "learningRateWarmupSteps": 123
  },
  "rewardWeights": [
    "<string>"
  ],
  "wandbConfig": {
    "enabled": true,
    "apiKey": "<string>",
    "project": "<string>",
    "entity": "<string>",
    "runId": "<string>"
  },
  "keepAlive": true,
  "rolloutDeploymentName": "<string>",
  "lossConfig": {
    "method": "METHOD_UNSPECIFIED",
    "klBeta": 123
  }
}
'
{
  "name": "<string>",
  "displayName": "<string>",
  "createTime": "2023-11-07T05:31:56Z",
  "completedTime": "2023-11-07T05:31:56Z",
  "dataset": "<string>",
  "evaluationDataset": "<string>",
  "evalAutoCarveout": true,
  "state": "JOB_STATE_UNSPECIFIED",
  "status": {
    "code": "OK",
    "message": "<string>"
  },
  "createdBy": "<string>",
  "trainingConfig": {
    "outputModel": "<string>",
    "baseModel": "<string>",
    "warmStartFrom": "<string>",
    "jinjaTemplate": "<string>",
    "learningRate": 123,
    "maxContextLength": 123,
    "loraRank": 123,
    "region": "REGION_UNSPECIFIED",
    "epochs": 123,
    "batchSize": 123,
    "gradientAccumulationSteps": 123,
    "learningRateWarmupSteps": 123
  },
  "rewardWeights": [
    "<string>"
  ],
  "wandbConfig": {
    "enabled": true,
    "apiKey": "<string>",
    "project": "<string>",
    "entity": "<string>",
    "runId": "<string>",
    "url": "<string>"
  },
  "keepAlive": true,
  "rolloutDeploymentName": "<string>",
  "lossConfig": {
    "method": "METHOD_UNSPECIFIED",
    "klBeta": 123
  }
}

Authorizations

Authorization
string
header
required

Bearer authentication using your Fireworks API key. Format: Bearer <API_KEY>

Path Parameters

account_id
string
required

The account ID.

Query Parameters

rlorTrainerJobId
string

ID of the RLOR trainer job. If not specified, a random UUID is generated.
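As a minimal sketch of how the path and query parameters above combine, the helper below builds the endpoint URL and appends rlorTrainerJobId only when a value is supplied. The function name `create_job_url` and the sample account and job IDs are hypothetical; the host and path come from the curl example on this page.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.fireworks.ai/v1"


def create_job_url(account_id, job_id=None):
    """Return the create endpoint URL, adding rlorTrainerJobId only when given."""
    url = f"{BASE_URL}/accounts/{quote(account_id, safe='')}/rlorTrainerJobs"
    if job_id is not None:
        # Omitting the parameter leaves ID generation to the server.
        url += "?" + urlencode({"rlorTrainerJobId": job_id})
    return url


# Hypothetical IDs, for illustration only.
print(create_job_url("my-account"))
print(create_job_url("my-account", "my-job-1"))
```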

Body

application/json
displayName
string
dataset
string

The name of the dataset used for training.

evaluationDataset
string

The name of a separate dataset to use for evaluation.

evalAutoCarveout
boolean

Whether to automatically carve out part of the dataset for evaluation.

trainingConfig
object

Common training configuration. BaseTrainingConfig contains the fields shared across different training job types.

rewardWeights
string[]

A list of reward metrics to use for training, in the format "<reward_name>=".

wandbConfig
object

The Weights & Biases team/user account for logging training progress.

keepAlive
boolean
rolloutDeploymentName
string

Name of the rollout deployment associated with this RLOR trainer job. Optional. If not set, the trainer does not trigger weight sync to the rollout engine.

lossConfig
object

Reinforcement learning loss method and hyperparameters for the underlying trainer.
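The body fields above can be assembled in code roughly as follows. This is a sketch using only the Python standard library, not an official client: `build_create_request` is a hypothetical helper, the field values are placeholders, and dropping `None` values is a choice made here on the assumption that unset fields can simply be left out of the body. The request is built but not sent.

```python
import json
import urllib.request


def build_create_request(account_id, api_key, body):
    """Build (but do not send) the POST request that creates an RLOR trainer job."""
    url = f"https://api.fireworks.ai/v1/accounts/{account_id}/rlorTrainerJobs"
    # Assumption: top-level fields left as None are simply omitted from the body.
    payload = {key: value for key, value in body.items() if value is not None}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values only; real dataset and model names come from your account.
example_body = {
    "displayName": "my-rft-step",
    "dataset": "<string>",
    "evaluationDataset": None,
    "trainingConfig": {"baseModel": "<string>", "outputModel": "<string>"},
    "keepAlive": False,
}
request = build_create_request("my-account", "<API_KEY>", example_body)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```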

Response

200 - application/json

A successful response.

name
string
displayName
string
createTime
string<date-time>
completedTime
string<date-time>
dataset
string

The name of the dataset used for training.

evaluationDataset
string

The name of a separate dataset to use for evaluation.

evalAutoCarveout
boolean

Whether to automatically carve out part of the dataset for evaluation.

state
enum<string>
default:JOB_STATE_UNSPECIFIED

JobState represents the state an asynchronous job can be in.

  • JOB_STATE_PAUSED: Job is paused, typically due to account suspension or manual intervention.
Available options:
JOB_STATE_UNSPECIFIED,
JOB_STATE_CREATING,
JOB_STATE_RUNNING,
JOB_STATE_COMPLETED,
JOB_STATE_FAILED,
JOB_STATE_CANCELLED,
JOB_STATE_DELETING,
JOB_STATE_WRITING_RESULTS,
JOB_STATE_VALIDATING,
JOB_STATE_DELETING_CLEANING_UP,
JOB_STATE_PENDING,
JOB_STATE_EXPIRED,
JOB_STATE_RE_QUEUEING,
JOB_STATE_CREATING_INPUT_DATASET,
JOB_STATE_IDLE,
JOB_STATE_CANCELLING,
JOB_STATE_EARLY_STOPPED,
JOB_STATE_PAUSED
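When polling a job, it helps to decide up front which states mean "stop waiting". The sketch below shows one way to do that. This page lists the state names but does not say which ones are final, so the set `ASSUMED_TERMINAL_STATES` is an assumption inferred from the names alone, and `is_terminal` is a hypothetical helper.

```python
# Assumption: these states are final. The reference lists the names only and
# does not state which ones are terminal, so adjust this set as needed.
ASSUMED_TERMINAL_STATES = frozenset({
    "JOB_STATE_COMPLETED",
    "JOB_STATE_FAILED",
    "JOB_STATE_CANCELLED",
    "JOB_STATE_EXPIRED",
    "JOB_STATE_EARLY_STOPPED",
})


def is_terminal(job):
    """Return True if the job's state is in the assumed-terminal set."""
    # A missing state falls back to the documented default.
    state = job.get("state", "JOB_STATE_UNSPECIFIED")
    return state in ASSUMED_TERMINAL_STATES


print(is_terminal({"state": "JOB_STATE_RUNNING"}))
print(is_terminal({"state": "JOB_STATE_COMPLETED"}))
```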
status
object

Mimics google.rpc.Status (https://github.com/googleapis/googleapis/blob/master/google/rpc/status.proto).

createdBy
string

The email address of the user who initiated this fine-tuning job.

trainingConfig
object

Common training configuration. BaseTrainingConfig contains the fields shared across different training job types.

rewardWeights
string[]

A list of reward metrics to use for training, in the format "<reward_name>=".

wandbConfig
object

The Weights & Biases team/user account for logging training progress.

keepAlive
boolean
rolloutDeploymentName
string

Name of the rollout deployment associated with this RLOR trainer job. Optional. If not set, the trainer does not trigger weight sync to the rollout engine.

lossConfig
object

Reinforcement learning loss method and hyperparameters for the underlying trainer.
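A minimal sketch of reading the response fields described above, using the sample values from the example response on this page. `summarize_job` is a hypothetical helper. It assumes timestamps carry a trailing "Z" as in the example, and it tolerates missing fields because this page does not say which response fields are always present.

```python
import json
from datetime import datetime


def summarize_job(response_text):
    """Pull a few fields out of a create-job response body."""
    job = json.loads(response_text)
    created = job.get("createTime")
    if created:
        # Rewrite the trailing "Z" so fromisoformat accepts it on older Pythons.
        created = datetime.fromisoformat(created.replace("Z", "+00:00"))
    status = job.get("status") or {}
    wandb = job.get("wandbConfig") or {}
    return {
        "name": job.get("name"),
        "state": job.get("state", "JOB_STATE_UNSPECIFIED"),
        "created": created,
        "status_code": status.get("code"),
        "wandb_url": wandb.get("url"),
    }


# Sample values copied from the example response on this page.
sample = json.dumps({
    "name": "<string>",
    "createTime": "2023-11-07T05:31:56Z",
    "state": "JOB_STATE_UNSPECIFIED",
    "status": {"code": "OK", "message": "<string>"},
})
summary = summarize_job(sample)
print(summary["state"], summary["status_code"], summary["created"])
```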