curl --request GET \
--url https://api.fireworks.ai/v1/accounts/{account_id}/deploymentShapes/{deployment_shape_id} \
--header 'Authorization: Bearer <token>'
{
"baseModel": "<string>",
"name": "<string>",
"displayName": "<string>",
"description": "<string>",
"createTime": "2023-11-07T05:31:56Z",
"updateTime": "2023-11-07T05:31:56Z",
"modelType": "<string>",
"parameterCount": "<string>",
"acceleratorCount": 123,
"acceleratorType": "ACCELERATOR_TYPE_UNSPECIFIED",
"precision": "PRECISION_UNSPECIFIED",
"disableDeploymentSizeValidation": true,
"enableAddons": true,
"draftTokenCount": 123,
"draftModel": "<string>",
"ngramSpeculationLength": 123,
"disableSpeculativeDecoding": true,
"enableSessionAffinity": true,
"numLoraDeviceCached": 123,
"maxContextLength": 123,
"presetType": "PRESET_TYPE_UNSPECIFIED"
}
Bearer authentication using your Fireworks API key. Format: Bearer <API_KEY>
The Account Id
The Deployment Shape Id
The fields to be returned in the response. If empty or "*", all fields will be returned.
If true, returns the latest version regardless of validation status. By default, returns the latest validated version.
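As a rough illustration of the request described above, the following Python sketch builds the same GET call as the curl example using only the standard library. The helper names `build_shape_request` and `get_deployment_shape` and the `timeout` argument are hypothetical, chosen for this example; the URL layout and the Bearer header are taken from the curl command. The optional query parameters are left out because their names are not shown on this page.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the curl example on this page.
API_BASE = "https://api.fireworks.ai/v1"


def build_shape_request(account_id: str, deployment_shape_id: str,
                        api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for one deployment shape.

    Hypothetical helper for illustration; not part of any Fireworks SDK.
    """
    # Percent-encode the path segments so unusual IDs cannot change the path.
    account = urllib.parse.quote(account_id, safe="")
    shape = urllib.parse.quote(deployment_shape_id, safe="")
    url = f"{API_BASE}/accounts/{account}/deploymentShapes/{shape}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def get_deployment_shape(account_id: str, deployment_shape_id: str,
                         api_key: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body into a dict."""
    request = build_shape_request(account_id, deployment_shape_id, api_key)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers without touching the network.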
A successful response.
Human-readable display name of the deployment shape, e.g. "My Deployment Shape". Must be fewer than 64 characters long.
The description of the deployment shape. Must be fewer than 1000 characters long.
The creation time of the deployment shape.
The update time for the deployment shape.
The model type of the base model.
The parameter count of the base model.
The number of accelerators used per replica. If not specified, the default is the estimated minimum required by the base model.
The type of accelerator to use. If not specified, the default is NVIDIA_A100_80GB.
Allowed values: ACCELERATOR_TYPE_UNSPECIFIED, NVIDIA_A100_80GB, NVIDIA_H100_80GB, AMD_MI300X_192GB, NVIDIA_A10G_24GB, NVIDIA_A100_40GB, NVIDIA_L4_24GB, NVIDIA_H200_141GB, NVIDIA_B200_180GB, AMD_MI325X_256GB, AMD_MI350X_288GB, NVIDIA_B300_288GB
The precision with which the model should be served.
Allowed values: PRECISION_UNSPECIFIED, FP16, FP8, FP8_MM, FP8_AR, FP8_MM_KV_ATTN, FP8_KV, FP8_MM_V2, FP8_V2, FP8_MM_KV_ATTN_V2, NF4, FP4, BF16, FP4_BLOCKSCALED_MM, FP4_MX_MOE
If true, the deployment size validation is disabled.
If true, LoRA addons are enabled for deployments created from this shape.
The number of candidate tokens to generate per step for speculative decoding. Default is the base model's draft_token_count.
The draft model name for speculative decoding, e.g. accounts/fireworks/models/my-draft-model. If empty, speculative decoding using a draft model is disabled. Default is the base model's default_draft_model. Deprecated: set default_draft_model on the base model instead.
The length of previous input sequence to be considered for N-gram speculation.
If true, speculative decoding is disabled for deployments created from this shape, even if the base model has default draft model settings.
Whether to apply sticky routing based on the user field.
The maximum context length supported by the model (context window). If set to 0 or not specified, the model's default maximum context length will be used.
Type of deployment shape for different deployment configurations.
PRESET_TYPE_UNSPECIFIED, MINIMAL, FAST, THROUGHPUT, FULL_PRECISION, AGENTIC_CODING, CHAT, SUMMARIZATION
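The defaults mentioned in the field descriptions above can be applied on the client side when reading a response. The sketch below is one possible interpretation, not documented server behavior: `summarize_shape` is a hypothetical helper, and it assumes that a missing field or an `ACCELERATOR_TYPE_UNSPECIFIED` value means "not specified", so the documented default of NVIDIA_A100_80GB applies. A `maxContextLength` of 0 or a missing value is mapped to `None` to signal "use the model's default".

```python
# Default accelerator type as stated in the acceleratorType description.
DEFAULT_ACCELERATOR_TYPE = "NVIDIA_A100_80GB"


def summarize_shape(shape: dict) -> dict:
    """Resolve documented defaults for a decoded deployment shape response.

    Hypothetical helper; treating *_UNSPECIFIED as "not specified" is an
    assumption made for this example.
    """
    accelerator = shape.get("acceleratorType") or "ACCELERATOR_TYPE_UNSPECIFIED"
    if accelerator == "ACCELERATOR_TYPE_UNSPECIFIED":
        accelerator = DEFAULT_ACCELERATOR_TYPE

    # 0 or missing means the model's own default context length is used,
    # which is not known on the client side, so report None.
    max_context = shape.get("maxContextLength") or None

    return {
        "accelerator_type": accelerator,
        "accelerator_count": shape.get("acceleratorCount"),
        "max_context_length": max_context,
        "speculative_decoding_disabled": bool(
            shape.get("disableSpeculativeDecoding", False)
        ),
    }


if __name__ == "__main__":
    sample = {
        "acceleratorType": "ACCELERATOR_TYPE_UNSPECIFIED",
        "acceleratorCount": 2,
        "maxContextLength": 0,
        "disableSpeculativeDecoding": True,
    }
    print(summarize_shape(sample))
```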