POST /v1/accounts/{account_id}/deployments/{deployment_id}:undelete
Undelete Deployment
curl --request POST \
  --url https://api.fireworks.ai/v1/accounts/{account_id}/deployments/{deployment_id}:undelete \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{}'
{
  "name": "<string>",
  "displayName": "<string>",
  "description": "<string>",
  "createTime": "2023-11-07T05:31:56Z",
  "expireTime": "2023-11-07T05:31:56Z",
  "purgeTime": "2023-11-07T05:31:56Z",
  "deleteTime": "2023-11-07T05:31:56Z",
  "state": "STATE_UNSPECIFIED",
  "status": {
    "code": "OK",
    "message": "<string>"
  },
  "minReplicaCount": 123,
  "maxReplicaCount": 123,
  "replicaCount": 123,
  "autoscalingPolicy": {
    "scaleUpWindow": "<string>",
    "scaleDownWindow": "<string>",
    "scaleToZeroWindow": "<string>",
    "loadTargets": {}
  },
  "baseModel": "<string>",
  "acceleratorCount": 123,
  "acceleratorType": "ACCELERATOR_TYPE_UNSPECIFIED",
  "precision": "PRECISION_UNSPECIFIED",
  "cluster": "<string>",
  "enableAddons": true,
  "draftTokenCount": 123,
  "draftModel": "<string>",
  "ngramSpeculationLength": 123,
  "numPeftDeviceCached": 123,
  "deploymentTemplate": "<string>",
  "autoTune": {
    "longPrompt": true
  },
  "placement": {
    "region": "REGION_UNSPECIFIED",
    "multiRegion": "MULTI_REGION_UNSPECIFIED",
    "regions": [
      "REGION_UNSPECIFIED"
    ]
  },
  "region": "REGION_UNSPECIFIED",
  "updateTime": "2023-11-07T05:31:56Z",
  "disableDeploymentSizeValidation": true,
  "enableMtp": true
}
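
The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_undelete_request` and `undelete_deployment` are hypothetical, and the URL, headers, and empty JSON body are copied from the curl example above. Percent-encoding the path parameters is an added precaution, not something this page requires.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.fireworks.ai"  # host taken from the curl example


def build_undelete_request(account_id: str, deployment_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the :undelete endpoint."""
    # Percent-encode the path parameters so unusual characters cannot break the URL.
    account = urllib.parse.quote(account_id, safe="")
    deployment = urllib.parse.quote(deployment_id, safe="")
    url = f"{API_BASE}/v1/accounts/{account}/deployments/{deployment}:undelete"
    return urllib.request.Request(
        url,
        data=json.dumps({}).encode("utf-8"),  # empty JSON object, as in the curl example
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def undelete_deployment(account_id: str, deployment_id: str, token: str) -> dict:
    """Send the request and return the decoded deployment object."""
    request = build_undelete_request(account_id, deployment_id, token)
    # urlopen raises urllib.error.HTTPError for non-2xx responses.
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers without touching the network.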

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Path Parameters

account_id
string
required

The account ID.

deployment_id
string
required

The deployment ID.

Body

application/json · object

Response

200 - application/json

A successful response.

baseModel
string
required
name
string
displayName
string

Human-readable display name of the deployment, e.g. "My Deployment". Must be fewer than 64 characters long.

description
string

Description of the deployment.

createTime
string<date-time>

The creation time of the deployment.

expireTime
string<date-time>

The time at which this deployment will automatically be deleted.

purgeTime
string<date-time>

The time at which the resource will be hard deleted.

deleteTime
string<date-time>

The time at which the resource will be soft deleted.
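
The timestamp fields are date-time strings such as `2023-11-07T05:31:56Z`. As a rough sketch, they can be parsed into timezone-aware `datetime` objects and compared against the current time. The helper names below are hypothetical. Treating a past `purgeTime` as "too late to undelete" is an assumption drawn from the field description (hard deletion), not a documented rule of this endpoint.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp like '2023-11-07T05:31:56Z' into an aware datetime."""
    # Older Python versions do not accept a trailing 'Z' in fromisoformat,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def purge_time_passed(deployment: dict, now: Optional[datetime] = None) -> bool:
    """Return True if the deployment has a purgeTime that is not in the future."""
    purge_time = deployment.get("purgeTime")
    if not purge_time:
        return False  # no purge time reported
    now = now or datetime.now(timezone.utc)
    return parse_timestamp(purge_time) <= now
```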

state
enum<string>
default:STATE_UNSPECIFIED

The state of the deployment.

Available options:
STATE_UNSPECIFIED,
CREATING,
READY,
DELETING,
FAILED,
UPDATING,
DELETED
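
After calling this endpoint, a client may want to inspect the returned `state`. The sketch below mirrors the options listed above in a Python enum and falls back to `STATE_UNSPECIFIED` when the field is absent, matching the documented default. The helper names are hypothetical, and this page does not say which state a deployment enters after a successful undelete, so the check only reports whether the deployment still looks deleted.

```python
from enum import Enum


class DeploymentState(str, Enum):
    """Deployment states as listed in the response schema."""
    STATE_UNSPECIFIED = "STATE_UNSPECIFIED"
    CREATING = "CREATING"
    READY = "READY"
    DELETING = "DELETING"
    FAILED = "FAILED"
    UPDATING = "UPDATING"
    DELETED = "DELETED"


def deployment_state(deployment: dict) -> DeploymentState:
    """Read the state field, using the documented default when it is missing."""
    return DeploymentState(deployment.get("state", "STATE_UNSPECIFIED"))


def is_deleted_or_deleting(deployment: dict) -> bool:
    """Return True if the deployment reports DELETING or DELETED."""
    return deployment_state(deployment) in {DeploymentState.DELETING, DeploymentState.DELETED}
```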
status
object

Detailed status information regarding the most recent operation.

minReplicaCount
integer

The minimum number of replicas. If not specified, the default is 0.

maxReplicaCount
integer

The maximum number of replicas. If not specified, the default is max(min_replica_count, 1). May be set to 0 to downscale the deployment to 0.
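
The two replica defaults described above can be expressed as a small function. This is only an illustration of the stated rules. The function name is hypothetical, and `None` stands in for "not specified", which may differ from how the API represents an unset field.

```python
from typing import Optional, Tuple


def effective_replica_bounds(
    min_replica_count: Optional[int] = None,
    max_replica_count: Optional[int] = None,
) -> Tuple[int, int]:
    """Apply the documented defaults for minReplicaCount and maxReplicaCount."""
    # minReplicaCount defaults to 0 when not specified.
    min_count = 0 if min_replica_count is None else min_replica_count
    # maxReplicaCount defaults to max(min_replica_count, 1) when not specified.
    max_count = max(min_count, 1) if max_replica_count is None else max_replica_count
    return min_count, max_count
```

For example, leaving both fields unset yields bounds of `(0, 1)`, while explicitly passing `max_replica_count=0` keeps the maximum at 0, which the description says downscales the deployment to 0.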

replicaCount
integer
autoscalingPolicy
object
acceleratorCount
integer

The number of accelerators used per replica. If not specified, the default is the estimated minimum required by the base model.

acceleratorType
enum<string>
default:ACCELERATOR_TYPE_UNSPECIFIED

The type of accelerator to use. If not specified, the default is NVIDIA_A100_80GB.

Available options:
ACCELERATOR_TYPE_UNSPECIFIED,
NVIDIA_A100_80GB,
NVIDIA_H100_80GB,
AMD_MI300X_192GB,
NVIDIA_A10G_24GB,
NVIDIA_A100_40GB,
NVIDIA_L4_24GB,
NVIDIA_H200_141GB,
NVIDIA_B200_180GB
precision
enum<string>
default:PRECISION_UNSPECIFIED

The precision with which the model should be served.

Available options:
PRECISION_UNSPECIFIED,
FP16,
FP8,
FP8_MM,
FP8_AR,
FP8_MM_KV_ATTN,
FP8_KV,
FP8_MM_V2,
FP8_V2,
FP8_MM_KV_ATTN_V2,
NF4,
FP4,
BF16,
FP4_BLOCKSCALED_MM,
FP4_MX_MOE
cluster
string

If set, this deployment is deployed to a cloud-premise cluster.

enableAddons
boolean

If true, PEFT addons are enabled for this deployment.

draftTokenCount
integer

The number of candidate tokens to generate per step for speculative decoding. Default is the base model's draft_token_count. Set CreateDeploymentRequest.disable_speculative_decoding to true to disable this behavior.

draftModel
string

The draft model name for speculative decoding, e.g. accounts/fireworks/models/my-draft-model. If empty, speculative decoding using a draft model is disabled. Default is the base model's default_draft_model. Set CreateDeploymentRequest.disable_speculative_decoding to true to disable this behavior.

ngramSpeculationLength
integer

The length of the previous input sequence to consider for N-gram speculation.

numPeftDeviceCached
integer
deploymentTemplate
string

The name of the deployment template to use for this deployment. Only available to enterprise accounts.

autoTune
object

The performance profile to use for this deployment.

placement
object

The desired geographic region where the deployment must be placed. If unspecified, the default is the GLOBAL multi-region.

region
enum<string>
default:REGION_UNSPECIFIED

The geographic region where the deployment is presently located. This region may change over time, but always stays within the placement constraint.

Available options:
REGION_UNSPECIFIED,
US_IOWA_1,
US_VIRGINIA_1,
US_VIRGINIA_2,
US_ILLINOIS_1,
AP_TOKYO_1,
EU_LONDON_1,
US_ARIZONA_1,
US_TEXAS_1,
US_ILLINOIS_2,
EU_FRANKFURT_1,
US_TEXAS_2,
EU_PARIS_1,
EU_HELSINKI_1,
US_NEVADA_1,
EU_ICELAND_1,
EU_ICELAND_2,
US_WASHINGTON_1,
US_WASHINGTON_2,
US_WASHINGTON_3,
AP_TOKYO_2,
US_CALIFORNIA_1,
US_UTAH_1
updateTime
string<date-time>

The update time for the deployment.

disableDeploymentSizeValidation
boolean

Whether the deployment size validation is disabled.

enableMtp
boolean

If true, MTP is enabled for this deployment.