Create Batch Inference Job

POST /v1/accounts/{account_id}/batchInferenceJobs

Example request:
curl --request POST \
  --url https://api.fireworks.ai/v1/accounts/{account_id}/batchInferenceJobs \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "displayName": "<string>",
  "model": "<string>",
  "inputDatasetId": "<string>",
  "outputDatasetId": "<string>",
  "inferenceParameters": {
    "maxTokens": 123,
    "temperature": 123,
    "topP": 123,
    "n": 123,
    "extraBody": "<string>",
    "topK": 123
  },
  "precision": "PRECISION_UNSPECIFIED",
  "continuedFromJobName": "<string>"
}
'
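The same request can be assembled from Python. The sketch below builds the URL, headers, and JSON body for the endpoint documented above without sending it; the account ID, dataset IDs, and model name are placeholder assumptions, and the API key is read from a `FIREWORKS_API_KEY` environment variable by convention.

```python
import os

API_BASE = "https://api.fireworks.ai/v1"

def build_create_batch_job_request(account_id, model, input_dataset_id,
                                   output_dataset_id, api_key,
                                   display_name=None, inference_parameters=None):
    """Assemble the URL, headers, and JSON body for the
    Create Batch Inference Job call shown in the curl example."""
    url = f"{API_BASE}/accounts/{account_id}/batchInferenceJobs"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,
        "inputDatasetId": input_dataset_id,
        "outputDatasetId": output_dataset_id,
    }
    # Optional fields are only included when provided.
    if display_name is not None:
        payload["displayName"] = display_name
    if inference_parameters is not None:
        payload["inferenceParameters"] = inference_parameters
    return url, headers, payload

url, headers, payload = build_create_batch_job_request(
    account_id="my-account",                      # placeholder account ID
    model="accounts/fireworks/models/my-model",   # placeholder model name
    input_dataset_id="my-input-dataset",
    output_dataset_id="my-output-dataset",
    api_key=os.environ.get("FIREWORKS_API_KEY", "<API_KEY>"),
    inference_parameters={"maxTokens": 512, "temperature": 0.7},
)
# Send with e.g. requests.post(url, headers=headers, json=payload)
```

Separating request construction from sending makes the payload easy to inspect or log before issuing the POST.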
Example response (200 OK):

{
  "name": "<string>",
  "displayName": "<string>",
  "createTime": "2023-11-07T05:31:56Z",
  "createdBy": "<string>",
  "state": "JOB_STATE_UNSPECIFIED",
  "status": {
    "code": "OK",
    "message": "<string>"
  },
  "model": "<string>",
  "inputDatasetId": "<string>",
  "outputDatasetId": "<string>",
  "inferenceParameters": {
    "maxTokens": 123,
    "temperature": 123,
    "topP": 123,
    "n": 123,
    "extraBody": "<string>",
    "topK": 123
  },
  "updateTime": "2023-11-07T05:31:56Z",
  "precision": "PRECISION_UNSPECIFIED",
  "jobProgress": {
    "percent": 123,
    "epoch": 123,
    "totalInputRequests": 123,
    "totalProcessedRequests": 123,
    "successfullyProcessedRequests": 123,
    "failedRequests": 123,
    "outputRows": 123,
    "inputTokens": 123,
    "outputTokens": 123,
    "cachedInputTokenCount": 123
  },
  "continuedFromJobName": "<string>"
}

Authorizations

Authorization
string
header
required

Bearer authentication using your Fireworks API key. Format: Bearer <API_KEY>

Path Parameters

account_id
string
required

The Account Id

Query Parameters

batchInferenceJobId
string

ID of the batch inference job.

Body

application/json
displayName
string
model
string

The name of the model to use for inference. This is required, except when continued_from_job_name is specified.

inputDatasetId
string

The name of the dataset used for inference. This is required, except when continued_from_job_name is specified.

outputDatasetId
string

The name of the dataset used for storing the results. This will also contain the error file.

inferenceParameters
BIJ inference parameters · object

Parameters controlling the inference process.

precision
enum<string>
default:PRECISION_UNSPECIFIED

The precision with which the model should be served. If PRECISION_UNSPECIFIED, a default will be chosen based on the model.

Available options:
PRECISION_UNSPECIFIED,
FP16,
FP8,
FP8_MM,
FP8_AR,
FP8_MM_KV_ATTN,
FP8_KV,
FP8_MM_V2,
FP8_V2,
FP8_MM_KV_ATTN_V2,
NF4,
FP4,
BF16,
FP4_BLOCKSCALED_MM,
FP4_MX_MOE
continuedFromJobName
string

The resource name of the batch inference job that this job continues from. Used for lineage tracking to understand job continuation chains.
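Per the field descriptions above, model and inputDatasetId are only required when continuedFromJobName is absent, so a continuation request can be minimal. The body below is a sketch; the job resource name and dataset ID are placeholders, and whether outputDatasetId must be re-specified is an assumption worth confirming against the service.

```python
import json

# Minimal body for continuing an earlier batch inference job. Because
# continuedFromJobName is set, model and inputDatasetId may be omitted
# (presumably carried over from the continued job). The resource name
# below is a placeholder, not a real job.
continuation_body = {
    "displayName": "continued-run",
    "continuedFromJobName": "accounts/my-account/batchInferenceJobs/previous-job-id",
    "outputDatasetId": "my-output-dataset-2",
}
serialized = json.dumps(continuation_body, indent=2)
```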

Response

200 - application/json

A successful response.

name
string
displayName
string
createTime
string<date-time>

The creation time of the batch inference job.

createdBy
string

The email address of the user who initiated this batch inference job.

state
enum<string>
default:JOB_STATE_UNSPECIFIED

JobState represents the state an asynchronous job can be in.

Available options:
JOB_STATE_UNSPECIFIED,
JOB_STATE_CREATING,
JOB_STATE_RUNNING,
JOB_STATE_COMPLETED,
JOB_STATE_FAILED,
JOB_STATE_CANCELLED,
JOB_STATE_DELETING,
JOB_STATE_WRITING_RESULTS,
JOB_STATE_VALIDATING,
JOB_STATE_DELETING_CLEANING_UP,
JOB_STATE_PENDING,
JOB_STATE_EXPIRED,
JOB_STATE_RE_QUEUEING,
JOB_STATE_CREATING_INPUT_DATASET,
JOB_STATE_IDLE,
JOB_STATE_CANCELLING,
JOB_STATE_EARLY_STOPPED,
JOB_STATE_PAUSED
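A client waiting on a job typically polls until the state reaches a terminal value. The set below is inferred from the enum names above (not stated by this page), so confirm it against the service's semantics; fetching the job would use a GET on its resource name, which is assumed here rather than documented on this page.

```python
# Plausible terminal states, inferred from the JobState enum names above;
# a job in any of these is assumed to make no further progress.
TERMINAL_STATES = {
    "JOB_STATE_COMPLETED",
    "JOB_STATE_FAILED",
    "JOB_STATE_CANCELLED",
    "JOB_STATE_EXPIRED",
    "JOB_STATE_EARLY_STOPPED",
}

def is_terminal(state: str) -> bool:
    """Return True when the job state is final and polling can stop."""
    return state in TERMINAL_STATES

# A polling loop would repeatedly fetch the job (e.g. a GET on its
# resource name, assumed to exist) and stop once
# is_terminal(job["state"]) returns True.
done = is_terminal("JOB_STATE_COMPLETED")
still_running = not is_terminal("JOB_STATE_RUNNING")
```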
status
object · Mirrors google.rpc.Status (https://github.com/googleapis/googleapis/blob/master/google/rpc/status.proto)
model
string

The name of the model to use for inference. This is required, except when continued_from_job_name is specified.

inputDatasetId
string

The name of the dataset used for inference. This is required, except when continued_from_job_name is specified.

outputDatasetId
string

The name of the dataset used for storing the results. This will also contain the error file.

inferenceParameters
BIJ inference parameters · object

Parameters controlling the inference process.

updateTime
string<date-time>

The update time for the batch inference job.

precision
enum<string>
default:PRECISION_UNSPECIFIED

The precision with which the model should be served. If PRECISION_UNSPECIFIED, a default will be chosen based on the model.

Available options:
PRECISION_UNSPECIFIED,
FP16,
FP8,
FP8_MM,
FP8_AR,
FP8_MM_KV_ATTN,
FP8_KV,
FP8_MM_V2,
FP8_V2,
FP8_MM_KV_ATTN_V2,
NF4,
FP4,
BF16,
FP4_BLOCKSCALED_MM,
FP4_MX_MOE
jobProgress
object

Job progress.

continuedFromJobName
string

The resource name of the batch inference job that this job continues from. Used for lineage tracking to understand job continuation chains.