Deploying LoRA models
Installing firectl
firectl is the command-line interface (CLI) utility for managing and deploying resources on the Fireworks AI platform. Use firectl to manage your LLM models.
See the Firectl Getting Started Guide for instructions on installing and using firectl.
Uploading a fine-tuned model
Make sure to review the requirements for a fine-tuned model. Sample configs for supported models are available here.
To upload a fine-tuned model located at /tmp/falcon-7b-addon/, run:
firectl create model my-model /tmp/falcon-7b-addon/
Once uploaded, you can see your model with:
firectl list models
Deploying your model
To deploy the model for inference, run:
firectl deploy my-model
Testing your model
Once your model is deployed, you can query it on the model page.
- Visit the list of your models.
- Click on the model you deployed.
- Enter your text prompt and click “Generate Completion”.
You should see your model’s response streamed below.
Using the API
You can also query the model directly using the /v1/completions API:
curl \
-H "Authorization: Bearer ${API_KEY}" \
-H "Content-Type: application/json" \
-d '{"model": "accounts/<ACCOUNT_ID>/models/my-model", "prompt": "hello, the sky is"}' \
https://api.fireworks.ai/inference/v1/completions
Or, using the fireworks.client Python package:
import fireworks.client

# Point the client at the Fireworks inference endpoint and authenticate.
fireworks.client.configure(
    api_base="https://api.fireworks.ai/inference",
    api_key="<API_KEY>",
)

# Request a short, deterministic completion from the deployed model.
fireworks.client.Completion.create(
    model="accounts/<ACCOUNT_ID>/models/my-model",
    prompt="Say this is a test",
    max_tokens=7,
    temperature=0,
)
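If you prefer not to install a client package, the same request can be sent with the Python standard library. The sketch below mirrors the curl example above; the `completion_request` helper and the placeholder account and model names are illustrative, not part of any Fireworks SDK, and the request is only actually sent if an API key is present in the environment:

```python
import json
import os
import urllib.request

API_URL = "https://api.fireworks.ai/inference/v1/completions"

def completion_request(account_id: str, model: str, prompt: str, api_key: str):
    """Build an HTTP request equivalent to the curl example above."""
    body = json.dumps({
        "model": f"accounts/{account_id}/models/{model}",
        "prompt": prompt,
    }).encode()
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = completion_request(
    "my-account", "my-model", "hello, the sky is",
    os.environ.get("FIREWORKS_API_KEY", ""),
)

# Only send the request when a key is actually configured.
if os.environ.get("FIREWORKS_API_KEY"):
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))
```

The JSON body and headers are identical to the curl invocation, so this is a convenient way to inspect exactly what is sent before wiring the call into an application.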
Cleaning up
Now that you are finished with the guide, you can undeploy the model to avoid accruing charges on your account:
firectl undeploy my-model
You can also delete the model from your account:
firectl delete model my-model
Deployment limits
Non-enterprise accounts are limited to a maximum of 100 deployed models.