Create Deployment
POST https://api.fireworks.ai/v1/accounts/{account_id}/deployments
curl --request POST \
  --url https://api.fireworks.ai/v1/accounts/{account_id}/deployments \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "displayName": "<string>",
    "description": "<string>",
    "expireTime": "2023-11-07T05:31:56Z",
    "minReplicaCount": 123,
    "maxReplicaCount": 123,
    "autoscalingPolicy": {
      "scaleUpWindow": "<string>",
      "scaleDownWindow": "<string>",
      "scaleToZeroWindow": "<string>",
      "loadTargets": {}
    },
    "baseModel": "<string>",
    "acceleratorCount": 123,
    "acceleratorType": "ACCELERATOR_TYPE_UNSPECIFIED",
    "precision": "PRECISION_UNSPECIFIED",
    "enableAddons": true,
    "draftTokenCount": 123,
    "draftModel": "<string>",
    "ngramSpeculationLength": 123,
    "deploymentTemplate": "<string>",
    "autoTune": {
      "longPrompt": true
    },
    "engine": "ENGINE_UNSPECIFIED",
    "forTraining": true
  }'
{
  "name": "<string>",
  "displayName": "<string>",
  "description": "<string>",
  "createTime": "2023-11-07T05:31:56Z",
  "expireTime": "2023-11-07T05:31:56Z",
  "purgeTime": "2023-11-07T05:31:56Z",
  "deleteTime": "2023-11-07T05:31:56Z",
  "createdBy": "<string>",
  "state": "STATE_UNSPECIFIED",
  "status": {
    "code": "OK",
    "message": "<string>"
  },
  "minReplicaCount": 123,
  "maxReplicaCount": 123,
  "replicaCount": 123,
  "autoscalingPolicy": {
    "scaleUpWindow": "<string>",
    "scaleDownWindow": "<string>",
    "scaleToZeroWindow": "<string>",
    "loadTargets": {}
  },
  "baseModel": "<string>",
  "acceleratorCount": 123,
  "acceleratorType": "ACCELERATOR_TYPE_UNSPECIFIED",
  "precision": "PRECISION_UNSPECIFIED",
  "cluster": "<string>",
  "enableAddons": true,
  "draftTokenCount": 123,
  "draftModel": "<string>",
  "ngramSpeculationLength": 123,
  "numPeftDeviceCached": 123,
  "deploymentTemplate": "<string>",
  "autoTune": {
    "longPrompt": true
  },
  "region": "REGION_UNSPECIFIED",
  "engine": "ENGINE_UNSPECIFIED",
  "updateTime": "2023-11-07T05:31:56Z",
  "forTraining": true
}
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
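To keep the token out of scripts and shell history, the header can be assembled from an environment variable. The following is a minimal sketch; the variable name FIREWORKS_API_KEY and the helper name auth_header are assumptions chosen for illustration, not names defined by this page.

```shell
# Sketch: build the Authorization header from an environment variable.
# FIREWORKS_API_KEY and auth_header are hypothetical names, not part of the API.
auth_header() {
  # Fail early with a clear message when the token is missing.
  if [ -z "${FIREWORKS_API_KEY:-}" ]; then
    echo "FIREWORKS_API_KEY is not set" >&2
    return 1
  fi
  printf 'Authorization: Bearer %s\n' "$FIREWORKS_API_KEY"
}
```

The result can then be passed to curl, for example with --header "$(auth_header)".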
Path Parameters
account_id: The account ID.
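The account ID replaces the {account_id} segment of the endpoint path. A small sketch of that substitution follows; the helper name deployments_url and the sample account ID are hypothetical.

```shell
# Sketch: substitute an account ID into the endpoint path shown on this page.
# deployments_url is a hypothetical helper name.
deployments_url() {
  printf 'https://api.fireworks.ai/v1/accounts/%s/deployments\n' "$1"
}

# Example with a made-up account ID:
deployments_url "my-account"
```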
Query Parameters
By default, a deployment created with a currently undeployed base model will be deployed to this deployment. If true, this auto-deploy function is disabled.
By default, a deployment will use the speculative decoding settings from the base model. If true, this will disable speculative decoding.
The ID of the deployment. If not specified, a random ID will be generated.
Body
application/json
The properties of the deployment being created.
The body is of type object.
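As an illustration of filling in the body object, the sketch below builds a small JSON payload from four of the fields shown in the request sample. Which fields are required is not stated on this page, so whether the API accepts a body with only these fields is an assumption. The helper name build_deployment_body and the sample values are hypothetical, and the function assumes the string values contain no quotes or backslashes, since it performs no JSON escaping.

```shell
# Sketch: build a small request body from fields in the request sample.
# build_deployment_body is a hypothetical helper; it does NOT escape JSON,
# so string arguments must not contain quotes or backslashes.
build_deployment_body() {
  # $1 = displayName, $2 = baseModel, $3 = minReplicaCount, $4 = maxReplicaCount
  printf '{"displayName":"%s","baseModel":"%s","minReplicaCount":%d,"maxReplicaCount":%d}\n' \
    "$1" "$2" "$3" "$4"
}

# Example with made-up values:
build_deployment_body "my-deployment" "example-model" 1 2
```

The output can be passed to curl with --data "$(build_deployment_body ...)" alongside the Content-Type: application/json header.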
Response
200 - application/json
A successful response.
The response is of type object.
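Because only the 200 response is documented here, a script may want to check the status code before using the body. The sketch below assumes curl was invoked with --write-out '\n%{http_code}', so that the status code arrives as the last line of the captured output. The helper name check_response is hypothetical, and treating every non-200 status as a failure is an assumption, since this page does not describe error responses.

```shell
# Sketch: separate body and HTTP status from output captured as
#   raw="$(curl --silent --write-out '\n%{http_code}' ...)"
# check_response is a hypothetical helper name.
check_response() {
  status="$(printf '%s\n' "$1" | tail -n 1)"   # last line: status code
  body="$(printf '%s\n' "$1" | sed '$d')"      # everything before it: body
  if [ "$status" = "200" ]; then
    printf '%s\n' "$body"
  else
    echo "request failed with HTTP $status" >&2
    return 1
  fi
}
```

On success the function prints the JSON body, which can then be inspected for fields such as "name" and "state".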