curl --request GET \
--url https://api.fireworks.ai/v1/accounts/{account_id}/batchInferenceJobs/{batch_inference_job_id} \
--header 'Authorization: Bearer <token>'

{
"name": "<string>",
"displayName": "<string>",
"createTime": "2023-11-07T05:31:56Z",
"createdBy": "<string>",
"state": "JOB_STATE_UNSPECIFIED",
"status": {
"code": "OK",
"message": "<string>"
},
"model": "<string>",
"inputDatasetId": "<string>",
"outputDatasetId": "<string>",
"inferenceParameters": {
"maxTokens": 123,
"temperature": 123,
"topP": 123,
"n": 123,
"extraBody": "<string>",
"topK": 123
},
"updateTime": "2023-11-07T05:31:56Z",
"precision": "PRECISION_UNSPECIFIED",
"jobProgress": {
"percent": 123,
"epoch": 123,
"totalInputRequests": 123,
"totalProcessedRequests": 123,
"successfullyProcessedRequests": 123,
"failedRequests": 123,
"outputRows": 123,
"inputTokens": 123,
"outputTokens": 123,
"cachedInputTokenCount": 123
},
"continuedFromJobName": "<string>"
}

Bearer authentication using your Fireworks API key. Format: Bearer <API_KEY>
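The curl example above can also be issued from Python's standard library. The sketch below only builds the authenticated GET request for this endpoint; the account ID, job ID, and API key are placeholders you supply, and the `fields` query parameter follows the field-mask behavior described below (empty or "*" returns all fields). This is an illustrative helper, not part of an official SDK.

```python
import urllib.request


def build_get_job_request(account_id, job_id, api_key, fields="*"):
    """Build the GET request for a single batch inference job.

    All arguments are caller-supplied placeholders; `fields` selects
    which response fields to return ("*" means all fields).
    """
    url = (
        f"https://api.fireworks.ai/v1/accounts/{account_id}"
        f"/batchInferenceJobs/{job_id}?fields={fields}"
    )
    # Bearer authentication with your Fireworks API key.
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )


# To execute the request, pass it to urllib.request.urlopen(...) and
# decode the JSON body shown in the response example above.
```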
The Account Id
The Batch Inference Job Id
The fields to be returned in the response. If empty or "*", all fields will be returned.
A successful response.
The creation time of the batch inference job.
The email address of the user who initiated this batch inference job.
JobState represents the state an asynchronous job can be in.
JOB_STATE_UNSPECIFIED, JOB_STATE_CREATING, JOB_STATE_RUNNING, JOB_STATE_COMPLETED, JOB_STATE_FAILED, JOB_STATE_CANCELLED, JOB_STATE_DELETING, JOB_STATE_WRITING_RESULTS, JOB_STATE_VALIDATING, JOB_STATE_DELETING_CLEANING_UP, JOB_STATE_PENDING, JOB_STATE_EXPIRED, JOB_STATE_RE_QUEUEING, JOB_STATE_CREATING_INPUT_DATASET, JOB_STATE_IDLE, JOB_STATE_CANCELLING, JOB_STATE_EARLY_STOPPED, JOB_STATE_PAUSED
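Because batch inference jobs are asynchronous, a common pattern is to poll this endpoint until the job reaches a terminal state. The sketch below assumes the set of terminal states based on the state names listed above (the API does not label which states are terminal here, so verify this set against your use case); `fetch_job` is any caller-supplied callable that returns the job JSON.

```python
import time

# Assumed terminal states, inferred from the JobState names above;
# confirm against the API's actual semantics before relying on this.
TERMINAL_STATES = {
    "JOB_STATE_COMPLETED",
    "JOB_STATE_FAILED",
    "JOB_STATE_CANCELLED",
    "JOB_STATE_EXPIRED",
    "JOB_STATE_EARLY_STOPPED",
}


def wait_until_done(fetch_job, poll_seconds=30, sleep=time.sleep):
    """Poll fetch_job() until the job's state is terminal.

    fetch_job: callable returning the job JSON from this endpoint.
    sleep: injectable for testing; defaults to time.sleep.
    """
    while True:
        job = fetch_job()
        if job["state"] in TERMINAL_STATES:
            return job
        sleep(poll_seconds)
```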
The status code.
OK, CANCELLED, UNKNOWN, INVALID_ARGUMENT, DEADLINE_EXCEEDED, NOT_FOUND, ALREADY_EXISTS, PERMISSION_DENIED, UNAUTHENTICATED, RESOURCE_EXHAUSTED, FAILED_PRECONDITION, ABORTED, OUT_OF_RANGE, UNIMPLEMENTED, INTERNAL, UNAVAILABLE, DATA_LOSS

A developer-facing error message in English.
The name of the model to use for inference. This is required, except when continued_from_job_name is specified.
The name of the dataset used for inference. This is required, except when continued_from_job_name is specified.
The name of the dataset used for storing the results. This will also contain the error file.
Parameters controlling the inference process.
Maximum number of tokens to generate per response.
Sampling temperature, typically between 0 and 2.
Top-p sampling parameter, typically between 0 and 1.
Number of response candidates to generate per input.
Additional parameters for the inference request, passed as a JSON string. For example: '{"stop": ["\n"]}'.
Top-k sampling parameter, limits the token selection to the top k tokens.
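A subtle point in the parameters above is that extraBody is a JSON string, not a nested object. The sketch below, with illustrative default values chosen here (not API defaults), shows one way to assemble an inferenceParameters payload that serializes the extra parameters correctly.

```python
import json


def make_inference_parameters(max_tokens=256, temperature=0.7, top_p=1.0,
                              n=1, top_k=40, extra=None):
    """Assemble an inferenceParameters object for a batch inference job.

    Defaults here are illustrative placeholders, not API defaults.
    `extra` is an optional dict of additional request parameters.
    """
    params = {
        "maxTokens": max_tokens,
        "temperature": temperature,
        "topP": top_p,
        "n": n,
        "topK": top_k,
    }
    if extra is not None:
        # extraBody must be a JSON *string*, so serialize the dict.
        params["extraBody"] = json.dumps(extra)
    return params
```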
The update time for the batch inference job.
The precision with which the model should be served. If PRECISION_UNSPECIFIED, a default will be chosen based on the model.
PRECISION_UNSPECIFIED, FP16, FP8, FP8_MM, FP8_AR, FP8_MM_KV_ATTN, FP8_KV, FP8_MM_V2, FP8_V2, FP8_MM_KV_ATTN_V2, NF4, FP4, BF16, FP4_BLOCKSCALED_MM, FP4_MX_MOE

Job progress.
Progress percent, within the range from 0 to 100.
The epoch for which the progress percent is reported, usually starting from 0. This is optional for jobs that don't run in an epoch fashion, e.g. BIJ, EVJ.
Total number of input requests/rows in the job.
Total number of requests that have been processed (successfully or failed).
Number of requests that were processed successfully.
Number of requests that failed to process.
Number of output rows generated.
Total number of input tokens processed.
Total number of output tokens generated.
The number of input tokens that hit the prompt cache.
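The jobProgress counters above can be combined into a quick health summary, for example a success rate over processed requests. The helper below is a hypothetical convenience, not part of the API; it reads only the fields documented above and guards against division by zero.

```python
def summarize_progress(job_progress):
    """Derive a small summary from this endpoint's jobProgress object.

    Illustrative helper only: uses the documented counter fields and
    falls back to a computed percentage if `percent` is absent.
    """
    total = job_progress["totalInputRequests"]
    done = job_progress["totalProcessedRequests"]
    ok = job_progress["successfullyProcessedRequests"]
    # Prefer the server-reported percent; otherwise approximate it.
    pct = job_progress.get("percent", 100 * done / total if total else 0)
    # Fraction of processed requests that succeeded (None if none yet).
    success_rate = ok / done if done else None
    return {"percent": pct, "success_rate": success_rate}
```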
The resource name of the batch inference job that this job continues from. Used for lineage tracking to understand job continuation chains.