List Deployments

curl --request GET \
  --url https://api.fireworks.ai/v1/accounts/{account_id}/deployments \
  --header 'Authorization: Bearer <token>'

{
  "deployments": [
    {
      "baseModel": "<string>",
      "name": "<string>",
      "displayName": "<string>",
      "description": "<string>",
      "createTime": "2023-11-07T05:31:56Z",
      "expireTime": "2023-11-07T05:31:56Z",
      "purgeTime": "2023-11-07T05:31:56Z",
      "deleteTime": "2023-11-07T05:31:56Z",
      "state": "STATE_UNSPECIFIED",
      "status": {
        "code": "OK",
        "message": "<string>"
      },
      "minReplicaCount": 123,
      "maxReplicaCount": 123,
      "desiredReplicaCount": 123,
      "replicaCount": 123,
      "autoscalingPolicy": {
        "scaleUpWindow": "<string>",
        "scaleDownWindow": "<string>",
        "scaleToZeroWindow": "<string>",
        "loadTargets": {}
      },
      "acceleratorCount": 123,
      "acceleratorType": "ACCELERATOR_TYPE_UNSPECIFIED",
      "precision": "PRECISION_UNSPECIFIED",
      "cluster": "<string>",
      "enableAddons": true,
      "draftTokenCount": 123,
      "draftModel": "<string>",
      "ngramSpeculationLength": 123,
      "enableSessionAffinity": true,
      "directRouteApiKeys": [
        "<string>"
      ],
      "numPeftDeviceCached": 123,
      "directRouteType": "DIRECT_ROUTE_TYPE_UNSPECIFIED",
      "directRouteHandle": "<string>",
      "deploymentTemplate": "<string>",
      "autoTune": {
        "longPrompt": true
      },
      "placement": {
        "region": "REGION_UNSPECIFIED",
        "multiRegion": "MULTI_REGION_UNSPECIFIED",
        "regions": [
          "REGION_UNSPECIFIED"
        ]
      },
      "region": "REGION_UNSPECIFIED",
      "updateTime": "2023-11-07T05:31:56Z",
      "disableDeploymentSizeValidation": true,
      "enableMtp": true,
      "enableHotLoad": true,
      "hotLoadBucketType": "BUCKET_TYPE_UNSPECIFIED",
      "enableHotReloadLatestAddon": true,
      "deploymentShape": "<string>",
      "activeModelVersion": "<string>",
      "targetModelVersion": "<string>",
      "replicaStats": {
        "pendingSchedulingReplicaCount": 123,
        "downloadingModelReplicaCount": 123,
        "initializingReplicaCount": 123,
        "readyReplicaCount": 123
      },
      "maxWithRevocableReplicaCount": 123
    }
  ],
  "nextPageToken": "<string>",
  "totalSize": 123
}
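
As an illustration of reading the sample response above, the following sketch reduces a decoded response body to a few fields per deployment. The helper name `summarize_deployments` is hypothetical, and the choice of fields is only an example; the field names themselves come from the sample response.

```python
def summarize_deployments(body):
    """Return one small dict per deployment from a decoded response body."""
    rows = []
    for deployment in body.get("deployments", []):
        # replicaStats may be absent (for example when a readMask excludes it).
        stats = deployment.get("replicaStats") or {}
        rows.append({
            "name": deployment.get("name"),
            "state": deployment.get("state"),
            "ready": stats.get("readyReplicaCount", 0),
            "desired": deployment.get("desiredReplicaCount", 0),
        })
    return rows
```

Using `.get` with defaults keeps the sketch tolerant of fields that are missing from a given response.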

GET /v1/accounts/{account_id}/deployments

Authorizations

Authorization

string

header

required

Bearer authentication using your Fireworks API key. Format: Bearer <API_KEY>

Path Parameters

account_id

string

required

The account ID.

Query Parameters

pageSize

integer<int32>

The maximum number of deployments to return. The maximum pageSize is 200; values above 200 are coerced to 200. If unspecified, the default is 50.

pageToken

string

A page token, received from a previous ListDeployments call. Provide this to retrieve the subsequent page. When paginating, all other parameters provided to ListDeployments must match the call that provided the page token.

filter

string

Only deployments satisfying the provided filter (if specified) will be returned. See https://google.aip.dev/160 for the filter grammar.

orderBy

string

A comma-separated list of fields to order by, e.g. "foo,bar". The default sort order is ascending. To sort a field in descending order, append a " desc" suffix, e.g. "foo desc,bar". Subfields are specified with a "." character, e.g. "foo.bar". If not specified, the default order is by "create_time".

showDeleted

boolean

If set, DELETED deployments will be included.

readMask

string

The fields to be returned in the response. If empty or "*", all fields will be returned.
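
The query parameters above can be combined into a request URL as in the following sketch. It uses only Python's standard library. The helper name `build_list_url` is hypothetical, and the filter expression `state="READY"` in the usage note is an assumed example of the AIP-160 grammar, not a value confirmed by this page.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.fireworks.ai/v1"

def build_list_url(account_id, **params):
    """Build the List Deployments URL, skipping parameters set to None."""
    cleaned = {}
    for key, value in params.items():
        if value is None:
            continue
        if isinstance(value, bool):
            # Assumption: booleans are sent as lowercase "true"/"false".
            value = "true" if value else "false"
        cleaned[key] = value
    url = f"{BASE_URL}/accounts/{account_id}/deployments"
    # urlencode percent-encodes values and writes spaces as "+".
    return f"{url}?{urlencode(cleaned)}" if cleaned else url
```

For example, `build_list_url("my-account", orderBy="create_time desc", pageSize=100)` yields a URL whose query string is `orderBy=create_time+desc&pageSize=100`. A filter such as `filter='state="READY"'` would be percent-encoded the same way.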

Response

200 - application/json

A successful response.

deployments

object[]

nextPageToken

string

A token that can be sent as pageToken to retrieve the next page. If this field is omitted, there are no subsequent pages.

totalSize

integer<int32>

The total number of deployments.
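
The pagination behavior described by pageSize, pageToken, and nextPageToken can be sketched as a loop that keeps requesting pages until the response omits nextPageToken. This is a minimal illustration using only Python's standard library, not an official client; the helper names `fetch_page` and `list_all_deployments` are hypothetical, and error handling is left out.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.fireworks.ai/v1"

def fetch_page(account_id, api_key, params):
    """Issue one GET request and return the decoded JSON body."""
    query = urllib.parse.urlencode(params)
    url = f"{BASE_URL}/accounts/{account_id}/deployments?{query}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)

def list_all_deployments(account_id, api_key, page_size=50,
                         fetch=fetch_page, **fixed_params):
    """Collect deployments from every page by following nextPageToken."""
    deployments = []
    page_token = None
    while True:
        # The other parameters stay identical on every call, as the
        # pageToken description requires.
        params = dict(fixed_params, pageSize=page_size)
        if page_token:
            params["pageToken"] = page_token
        body = fetch(account_id, api_key, params)
        deployments.extend(body.get("deployments", []))
        page_token = body.get("nextPageToken")
        if not page_token:
            return deployments
```

The `fetch` argument exists so the loop can be exercised with a stand-in function instead of a live request.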
