POST /v1/accounts/{account_id}/reinforcementFineTuningJobs
Create Reinforcement Fine-tuning Job
curl --request POST \
  --url https://api.fireworks.ai/v1/accounts/{account_id}/reinforcementFineTuningJobs \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "displayName": "<string>",
  "dataset": "<string>",
  "evaluationDataset": "<string>",
  "evalAutoCarveout": true,
  "trainingConfig": {
    "outputModel": "<string>",
    "baseModel": "<string>",
    "warmStartFrom": "<string>",
    "jinjaTemplate": "<string>",
    "learningRate": 123,
    "maxContextLength": 123,
    "loraRank": 123,
    "acceleratorCount": 123,
    "region": "REGION_UNSPECIFIED",
    "epochs": 123,
    "batchSize": 123,
    "gradientAccumulationSteps": 123,
    "learningRateWarmupSteps": 123
  },
  "evaluator": "<string>",
  "wandbConfig": {
    "enabled": true,
    "apiKey": "<string>",
    "project": "<string>",
    "entity": "<string>",
    "runId": "<string>"
  },
  "outputStats": "<string>",
  "inferenceParameters": {
    "maxTokens": 123,
    "temperature": 123,
    "topP": 123,
    "n": 123,
    "extraBody": "<string>",
    "topK": 123
  },
  "outputMetrics": "<string>",
  "mcpServer": "<string>"
}'
{
  "name": "<string>",
  "displayName": "<string>",
  "createTime": "2023-11-07T05:31:56Z",
  "completedTime": "2023-11-07T05:31:56Z",
  "dataset": "<string>",
  "evaluationDataset": "<string>",
  "evalAutoCarveout": true,
  "state": "JOB_STATE_UNSPECIFIED",
  "status": {
    "code": "OK",
    "message": "<string>"
  },
  "createdBy": "<string>",
  "trainingConfig": {
    "outputModel": "<string>",
    "baseModel": "<string>",
    "warmStartFrom": "<string>",
    "jinjaTemplate": "<string>",
    "learningRate": 123,
    "maxContextLength": 123,
    "loraRank": 123,
    "acceleratorCount": 123,
    "region": "REGION_UNSPECIFIED",
    "epochs": 123,
    "batchSize": 123,
    "gradientAccumulationSteps": 123,
    "learningRateWarmupSteps": 123
  },
  "evaluator": "<string>",
  "wandbConfig": {
    "enabled": true,
    "apiKey": "<string>",
    "project": "<string>",
    "entity": "<string>",
    "runId": "<string>",
    "url": "<string>"
  },
  "outputStats": "<string>",
  "inferenceParameters": {
    "maxTokens": 123,
    "temperature": 123,
    "topP": 123,
    "n": 123,
    "extraBody": "<string>",
    "topK": 123
  },
  "outputMetrics": "<string>",
  "mcpServer": "<string>"
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
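As a rough Python counterpart to the curl example, the sketch below builds (but does not send) the same POST request with the bearer header. The helper name `build_create_request` and the sample values are hypothetical; the URL and header names are taken from the curl example above.

```python
import json
import urllib.request

# Base URL as it appears in the curl example.
API_BASE = "https://api.fireworks.ai/v1"


def build_create_request(account_id: str, token: str, body: dict) -> urllib.request.Request:
    """Build the POST request for creating a job, without sending it.

    `build_create_request` is an illustrative helper, not part of any SDK.
    """
    url = f"{API_BASE}/accounts/{account_id}/reinforcementFineTuningJobs"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Bearer authentication header of the form "Bearer <token>".
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would actually submit the job, so keep the token out of source control and logs.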

Path Parameters

account_id
string
required

The account ID.

Query Parameters

reinforcementFineTuningJobId
string

ID of the reinforcement fine-tuning job. If not specified, a random UUID is generated.
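To illustrate how the optional query parameter attaches to the endpoint, here is a minimal sketch that builds the request URL with or without `reinforcementFineTuningJobId`. The function name `job_create_url` and the example IDs are hypothetical.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def job_create_url(account_id: str, job_id: Optional[str] = None) -> str:
    """Return the create-job URL, adding the job ID query parameter if given.

    Illustrative helper only; when `job_id` is omitted, the server is
    documented to generate a random UUID.
    """
    base = (
        "https://api.fireworks.ai/v1/accounts/"
        f"{quote(account_id, safe='')}/reinforcementFineTuningJobs"
    )
    if job_id is None:
        return base
    return f"{base}?{urlencode({'reinforcementFineTuningJobId': job_id})}"
```

Choosing your own ID can make the job easier to find later; omitting it leaves ID generation to the service.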

Body

application/json
dataset
string
required

The name of the dataset used for training.

evaluator
string
required

The evaluator resource name to use for the RLOR fine-tuning job.

displayName
string
evaluationDataset
string

The name of a separate dataset to use for evaluation.

evalAutoCarveout
boolean

Whether to automatically carve out part of the dataset for evaluation.

state
enum<string>
default:JOB_STATE_UNSPECIFIED

JobState represents the state an asynchronous job can be in.

Available options:
JOB_STATE_UNSPECIFIED,
JOB_STATE_CREATING,
JOB_STATE_RUNNING,
JOB_STATE_COMPLETED,
JOB_STATE_FAILED,
JOB_STATE_CANCELLED,
JOB_STATE_DELETING,
JOB_STATE_WRITING_RESULTS,
JOB_STATE_VALIDATING,
JOB_STATE_DELETING_CLEANING_UP,
JOB_STATE_PENDING,
JOB_STATE_EXPIRED,
JOB_STATE_RE_QUEUEING,
JOB_STATE_CREATING_INPUT_DATASET,
JOB_STATE_IDLE
status
object
trainingConfig
object

Common training configurations.

wandbConfig
object

The Weights & Biases team/user account for logging training progress.

outputStats
string

The output dataset's aggregated stats for the evaluation job.

inferenceParameters
object

BIJ parameters.

outputMetrics
string
mcpServer
string
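Since only `dataset` and `evaluator` are marked required, a request body can be quite small. The sketch below assembles a body, enforces the two required fields, and rejects keys that do not appear in the curl example. The helper `build_job_body` is hypothetical, and the allowed-key set is an assumption drawn from the sample request rather than a confirmed schema.

```python
# Top-level keys that appear in the sample request body (assumed writable).
ALLOWED_KEYS = frozenset({
    "displayName", "dataset", "evaluationDataset", "evalAutoCarveout",
    "trainingConfig", "evaluator", "wandbConfig", "outputStats",
    "inferenceParameters", "outputMetrics", "mcpServer",
})


def build_job_body(dataset: str, evaluator: str, **optional) -> dict:
    """Assemble a create-job request body (illustrative helper).

    Raises ValueError if a required field is empty or an unknown key is passed.
    Optional fields set to None are left out of the body.
    """
    if not dataset or not evaluator:
        raise ValueError("dataset and evaluator are required")
    unknown = set(optional) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    body = {"dataset": dataset, "evaluator": evaluator}
    body.update({k: v for k, v in optional.items() if v is not None})
    return body
```

For example, `build_job_body("my-dataset", "my-evaluator", evalAutoCarveout=True)` yields a three-key body; the dataset and evaluator names here are placeholders, not real resource names.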

Response

200 - application/json

A successful response.

dataset
string
required

The name of the dataset used for training.

evaluator
string
required

The evaluator resource name to use for the RLOR fine-tuning job.

name
string
displayName
string
createTime
string<date-time>
completedTime
string<date-time>

The completed time for the reinforcement fine-tuning job.

evaluationDataset
string

The name of a separate dataset to use for evaluation.

evalAutoCarveout
boolean

Whether to automatically carve out part of the dataset for evaluation.

state
enum<string>
default:JOB_STATE_UNSPECIFIED

JobState represents the state an asynchronous job can be in.

Available options:
JOB_STATE_UNSPECIFIED,
JOB_STATE_CREATING,
JOB_STATE_RUNNING,
JOB_STATE_COMPLETED,
JOB_STATE_FAILED,
JOB_STATE_CANCELLED,
JOB_STATE_DELETING,
JOB_STATE_WRITING_RESULTS,
JOB_STATE_VALIDATING,
JOB_STATE_DELETING_CLEANING_UP,
JOB_STATE_PENDING,
JOB_STATE_EXPIRED,
JOB_STATE_RE_QUEUEING,
JOB_STATE_CREATING_INPUT_DATASET,
JOB_STATE_IDLE
status
object
createdBy
string

The email address of the user who initiated this fine-tuning job.

trainingConfig
object

Common training configurations.

wandbConfig
object

The Weights & Biases team/user account for logging training progress.

outputStats
string

The output dataset's aggregated stats for the evaluation job.

inferenceParameters
object

BIJ parameters.

outputMetrics
string
mcpServer
string
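A client will usually want to turn the `state` field of the response into a simple "done or not" decision. The sketch below does that for a parsed response dictionary. Which states count as terminal is an assumption here (the reference lists the enum values but does not say which are final), and `summarize_job` is a hypothetical helper.

```python
# ASSUMPTION: these states are treated as final; the reference does not
# document state transitions, so adjust this set as needed.
TERMINAL_STATES = frozenset({
    "JOB_STATE_COMPLETED",
    "JOB_STATE_FAILED",
    "JOB_STATE_CANCELLED",
    "JOB_STATE_EXPIRED",
})


def summarize_job(response: dict) -> dict:
    """Reduce a job response to its name, state, and two convenience flags."""
    # The documented default for a missing state is JOB_STATE_UNSPECIFIED.
    state = response.get("state", "JOB_STATE_UNSPECIFIED")
    return {
        "name": response.get("name"),
        "state": state,
        "finished": state in TERMINAL_STATES,
        "succeeded": state == "JOB_STATE_COMPLETED",
    }
```

When `finished` is true but `succeeded` is false, the `status.message` field of the response is the natural place to look for details.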