A model must be deployed before it can be used for inference. Fireworks deploys the most popular base models to serverless deployments that can be used out of the box (including PEFT addons). See Querying text models.

Less popular base models or custom base models must be used with an on-demand deployment.

Deploying a model

PEFT addons

Deploying to serverless

Fireworks also supports deploying serverless addons for supported base models. To deploy a PEFT addon to serverless, run firectl deploy without passing a deployment ID:

firectl deploy <MODEL_ID>

Serverless addons are charged by input and output tokens for inference. There is no additional charge for deploying serverless addons.

PEFT addons on serverless have higher latency compared with base model inference. This includes LoRA fine-tunes, which are one type of PEFT addon. For faster inference speeds with PEFT addons, we recommend deploying to on-demand.

Unused addons may be automatically undeployed after a week.

Deploying to on-demand

Addons may also be deployed to an on-demand deployment of a supported base model. To create an on-demand deployment with addons enabled, run:

firectl create deployment "accounts/fireworks/models/<MODEL_ID of base model>" --enable-addons

On-demand deployments are charged by GPU-hour. See Pricing for details.

Once the deployment is ready, deploy the addon to the deployment:

firectl deploy <MODEL_ID> --deployment <DEPLOYMENT_ID>
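
In scripts, you typically want to wait for the deployment to become ready before deploying the addon. The sketch below stubs out the status check with a shell function so it is self-contained; replace the stub with a real firectl status query. The state names and polling cadence here are assumptions, not documented firectl output:

```shell
# Stub standing in for a real deployment status query; replace the function
# body with the actual firectl call. The CREATING/READY values are assumptions.
deployment_state() {
  if [ "$1" -lt 3 ]; then
    echo "CREATING"
  else
    echo "READY"
  fi
}

# Poll until the deployment reports READY, then deploy the addon.
ATTEMPT=1
while [ "$(deployment_state "$ATTEMPT")" != "READY" ]; do
  echo "waiting for deployment (attempt $ATTEMPT)..."
  sleep 1                      # use a longer interval in practice
  ATTEMPT=$((ATTEMPT + 1))
done
echo "deployment ready"
# firectl deploy <MODEL_ID> --deployment <DEPLOYMENT_ID>
```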

Base models

Custom base models may only be used with on-demand deployments. To create one, run:

firectl create deployment <MODEL_ID>

On-demand deployments are charged by GPU-hour. See Pricing for details.

Use the <MODEL_ID> specified during model upload. Creating the deployment will automatically deploy the base model to the deployment.

Checking whether a model is deployed

You can check the status of a model deployment by looking at the “Deployed Model Refs” section in the output of:

firectl get model <MODEL_ID>

If successful, there will be an entry with State: DEPLOYED.

Alternatively, you can list all deployed models within your account by running:

firectl list deployed-models
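
To check this from a script, you can grep the command output for the deployed state. The sample output below is illustrative, mirroring the “Deployed Model Refs” format shown on this page; the exact formatting printed by firectl may differ:

```shell
# Illustrative output; in practice you would capture it with:
#   OUTPUT=$(firectl get model <MODEL_ID>)
OUTPUT='Deployed Model Refs:
  [{
    Name: accounts/alice/deployedModels/custom-model-abcdef
    Deployment: accounts/alice/deployments/12345678
    State: DEPLOYED
  }]'

# Exit status of grep tells you whether any ref is in the DEPLOYED state.
if printf '%s\n' "$OUTPUT" | grep -q 'State: DEPLOYED'; then
  echo "model is deployed"
else
  echo "model is not deployed"
fi
```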

Inference

Model identifier

After your model is successfully deployed, it will be ready for inference. A model can be queried using one of the following model identifiers:

  • The model and deployment names - accounts/<ACCOUNT_ID of model>/models/<MODEL_ID>#accounts/<ACCOUNT_ID of deployment>/deployments/<DEPLOYMENT_ID>, e.g.

    • accounts/fireworks/models/mixtral-8x7b#accounts/alice/deployments/12345678
    • accounts/alice/models/custom-model#accounts/alice/deployments/12345678
  • The model and deployment short-names - <ACCOUNT_ID of model>/<MODEL_ID>#<ACCOUNT_ID of deployment>/<DEPLOYMENT_ID>, e.g.

    • fireworks/mixtral-8x7b#alice/12345678
    • alice/custom-model#alice/12345678
  • Deployed model name - Instead of using both the model and deployment names to refer to a deployed model, you can use its unique deployed model name, which is created upon deployment. The deployed model ID takes the form <MODEL_ID>-<AUTOGENERATED_SUFFIX> and can be viewed with firectl list deployed-models.

    • accounts/alice/deployedModels/mixtral-8x7b-abcdef
  • If you are deploying a custom model, you can also query it using the model name or model short-name, e.g.:

    • accounts/alice/models/custom-model
    • alice/custom-model

You can also use short names in place of the model and deployment names. For example:

  • <ACCOUNT_ID>/<MODEL_ID>
  • <ACCOUNT_ID>/<MODEL_ID>#<ACCOUNT_ID>/<DEPLOYMENT_ID>
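
The identifier formats above are plain string compositions, so they are easy to assemble in a script. A quick sketch using the example account and resource IDs from this page:

```shell
# Example IDs taken from the identifier examples above.
MODEL_ACCOUNT="fireworks"
MODEL_ID="mixtral-8x7b"
DEPLOY_ACCOUNT="alice"
DEPLOYMENT_ID="12345678"

# Full model and deployment names:
#   accounts/<ACCOUNT_ID>/models/<MODEL_ID>#accounts/<ACCOUNT_ID>/deployments/<DEPLOYMENT_ID>
FULL_NAME="accounts/${MODEL_ACCOUNT}/models/${MODEL_ID}#accounts/${DEPLOY_ACCOUNT}/deployments/${DEPLOYMENT_ID}"

# Short-name form: <ACCOUNT_ID>/<MODEL_ID>#<ACCOUNT_ID>/<DEPLOYMENT_ID>
SHORT_NAME="${MODEL_ACCOUNT}/${MODEL_ID}#${DEPLOY_ACCOUNT}/${DEPLOYMENT_ID}"

echo "$FULL_NAME"
echo "$SHORT_NAME"
```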

Multiple deployments

Since a model may be deployed to multiple deployments, querying by model name will route to the “default” deployed model. You can see which deployed model entry is marked with Default: true by describing the model:

firectl get model <MODEL_ID>
...
Deployed Model Refs:
  [{
    Name: accounts/<ACCOUNT_ID>/deployedModels/<DEPLOYED_MODEL_ID_1>
    Deployment: accounts/<ACCOUNT_ID>/deployments/<DEPLOYMENT_ID_1>
    State: DEPLOYED
    Default: true
  },
  {
    Name: accounts/<ACCOUNT_ID>/deployedModels/<DEPLOYED_MODEL_ID_2>
    Deployment: accounts/<ACCOUNT_ID>/deployments/<DEPLOYMENT_ID_2>
    State: DEPLOYED
  },
]

To update the default deployed model, note the deployed model ID, i.e. the final segment of the “Name” field in the deployed model reference above. Then run:

firectl update deployed-model <DEPLOYED_MODEL_ID_2> --default
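
Since the ID is the final path segment of the “Name” field, you can extract it with standard shell parameter expansion. A small sketch (the reference name below is hypothetical):

```shell
# Hypothetical deployed model reference name, as shown by `firectl get model`.
NAME="accounts/alice/deployedModels/mixtral-8x7b-abcdef"

# Strip everything up to and including the last "/" to get the ID.
DEPLOYED_MODEL_ID="${NAME##*/}"
echo "$DEPLOYED_MODEL_ID"

# You would then run:
#   firectl update deployed-model "$DEPLOYED_MODEL_ID" --default
```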

Deleting a default deployment

To delete a default deployment you must delete all other deployments for the same model first, or designate a different deployed model as the default as described above. This is to ensure that querying by model name will always route to an unambiguous default deployment as long as deployments for the model exist.

Querying the model

To test the model using the completions API, run:

curl \
  --header 'Authorization: Bearer <FIREWORKS_API_KEY>' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "<MODEL_IDENTIFIER>",
    "prompt": "Say this is a test"
}' \
  --url https://api.fireworks.ai/inference/v1/completions

See Querying text models for a more comprehensive guide.
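
When scripting this request, it helps to build the JSON payload in its own variable so the quoting stays manageable. A minimal sketch, where the API key and model identifier are placeholders you must substitute:

```shell
API_KEY="<FIREWORKS_API_KEY>"    # placeholder; substitute your real key
MODEL="accounts/fireworks/models/mixtral-8x7b"

# Assemble the request body separately from the curl invocation.
PAYLOAD=$(cat <<EOF
{
  "model": "${MODEL}",
  "prompt": "Say this is a test"
}
EOF
)

echo "$PAYLOAD"

# Then send it (network call, not run here):
#   curl --header "Authorization: Bearer ${API_KEY}" \
#        --header 'Content-Type: application/json' \
#        --data "$PAYLOAD" \
#        --url https://api.fireworks.ai/inference/v1/completions
```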

Publishing a model

By default, models can only be queried by the account that owns them. To make a model public, pass the --public flag when creating or updating it.

firectl update model <MODEL_ID> --public

To unpublish it, run:

firectl update model <MODEL_ID> --public=false