Upload your own models from Hugging Face or elsewhere to deploy fine-tuned or custom-trained models optimized for your use case.
  • Multiple upload options – Upload from local files or directly from S3 buckets or Azure Blob Storage
  • Secure uploads – All uploads are encrypted and models remain private to your account by default

Requirements

Supported architectures

Fireworks supports most popular model architectures, including:

Required files

You’ll need standard Hugging Face model files: config.json, model weights (.safetensors or .bin), and tokenizer files.
The exact files required depend on the model architecture. In general, you will need:
  • Model configuration: config.json
    Fireworks does not support the quantization_config option in config.json.
  • Model weights in one of the following formats:
    • *.safetensors
    • *.bin
  • Weights index: *.index.json
  • Tokenizer file(s), e.g.:
    • tokenizer.model
    • tokenizer.json
    • tokenizer_config.json
If the requisite files are not present, model deployment may fail.
Enabling chat completions: To enable the chat completions API for your custom base model, ensure your tokenizer_config.json contains a chat_template field. See the Hugging Face guide on Templates for Chat Models for details.
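The file requirements above can be checked locally before uploading. The following is a minimal sketch, not part of firectl: the function name check_model_dir and its messages are hypothetical, and it does not check for a weights index (*.index.json).

```python
import json
from pathlib import Path


def check_model_dir(model_dir):
    """Return a list of problems found in a Hugging Face-style model directory."""
    root = Path(model_dir)
    problems = []

    # config.json must exist and must not carry a quantization_config entry.
    config_path = root / "config.json"
    if not config_path.is_file():
        problems.append("missing config.json")
    else:
        config = json.loads(config_path.read_text(encoding="utf-8"))
        if "quantization_config" in config:
            problems.append("config.json contains quantization_config")

    # At least one weights file in a supported format.
    if not (any(root.glob("*.safetensors")) or any(root.glob("*.bin"))):
        problems.append("no *.safetensors or *.bin weights found")

    # At least one of the tokenizer files named in the list above.
    tokenizer_names = ("tokenizer.model", "tokenizer.json", "tokenizer_config.json")
    if not any((root / name).is_file() for name in tokenizer_names):
        problems.append("no tokenizer files found")

    # chat_template is only needed if you want the chat completions API.
    tok_cfg_path = root / "tokenizer_config.json"
    if tok_cfg_path.is_file():
        tok_cfg = json.loads(tok_cfg_path.read_text(encoding="utf-8"))
        if "chat_template" not in tok_cfg:
            problems.append("tokenizer_config.json has no chat_template")

    return problems
```

An empty list means none of these checks failed. It does not guarantee that the upload or deployment will succeed.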

Uploading your model

For larger models, you can upload directly from cloud storage (S3 or Azure Blob Storage) for faster transfer instead of uploading from your local machine.
Upload from your local machine:
firectl create model <MODEL_ID> /path/to/files/
If you’re uploading an embedding model, add the --embedding flag.

Verifying your upload

After uploading, verify your model is ready to deploy:
firectl get model accounts/<ACCOUNT_ID>/models/<MODEL_NAME>
Look for State: READY in the output. Once ready, you can create a deployment.
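If you want to script this check, one option is to poll the command above until the output contains State: READY. This is a rough sketch under assumptions: the helper names, timeout, and polling interval are hypothetical, and the only thing assumed about the output is that it includes the text "State: READY" once the model is ready.

```python
import re
import subprocess
import time


def is_ready(firectl_output):
    """Return True if the text contains 'State: READY'."""
    return re.search(r"\bState:\s*READY\b", firectl_output) is not None


def wait_until_ready(model_name, timeout_s=600, interval_s=10):
    """Poll `firectl get model` until the model is ready or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        result = subprocess.run(
            ["firectl", "get", "model", model_name],
            capture_output=True,
            text=True,
            check=True,  # raise if firectl exits with an error
        )
        if is_ready(result.stdout):
            return True
        time.sleep(interval_s)
    return False
```

Here model_name is the full resource name, in the form accounts/<ACCOUNT_ID>/models/<MODEL_NAME>.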

Deploying your model

Once your model shows State: READY, create a deployment:
firectl create deployment accounts/<ACCOUNT_ID>/models/<MODEL_NAME> --wait
See the On-demand deployments guide for configuration options like GPU types, autoscaling, and quantization.

Publishing your model

By default, models are private to your account. Publish a model to make it available to other Fireworks users. A published model is:
  • Listed in the public model catalog
  • Deployable by anyone with a Fireworks account
  • Still hosted and controlled by your account
Publish a model:
firectl update model <MODEL_ID> --public
Unpublish a model:
firectl update model <MODEL_ID> --public=false

Importing fine-tuned models

In addition to models you fine-tune on the Fireworks platform, you can also upload your own custom fine-tuned models as LoRA adapters.

Requirements

Your custom LoRA addon must contain the following files:
  • adapter_config.json - The Hugging Face adapter configuration file
  • adapter_model.bin or adapter_model.safetensors - The saved addon file
The adapter_config.json must contain the following fields:
  • r - The LoRA rank. Must be an integer between 4 and 64, inclusive
  • target_modules - A list of target modules. Currently the following target modules are supported:
    • q_proj
    • k_proj
    • v_proj
    • o_proj
    • up_proj or w1
    • down_proj or w2
    • gate_proj or w3
    • block_sparse_moe.gate
Additional fields may be specified but are ignored.
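The constraints on adapter_config.json can be checked before uploading. The sketch below is illustrative only: validate_adapter_config and SUPPORTED_TARGET_MODULES are hypothetical names, and the module list is copied from the list above, so it may drift out of date.

```python
import json

# Copied from the supported target modules listed above.
SUPPORTED_TARGET_MODULES = (
    "q_proj", "k_proj", "v_proj", "o_proj",
    "up_proj", "w1", "down_proj", "w2", "gate_proj", "w3",
    "block_sparse_moe.gate",
)


def validate_adapter_config(config):
    """Return a list of problems found in a parsed adapter_config.json dict."""
    problems = []

    # r must be an integer in [4, 64]; bool is excluded because it subclasses int.
    r = config.get("r")
    if isinstance(r, bool) or not isinstance(r, int) or not 4 <= r <= 64:
        problems.append(f"r must be an integer between 4 and 64, got {r!r}")

    # target_modules must be a list containing only supported module names.
    modules = config.get("target_modules")
    if not isinstance(modules, list):
        problems.append("target_modules must be a list")
    else:
        unsupported = [m for m in modules if m not in SUPPORTED_TARGET_MODULES]
        if unsupported:
            problems.append(f"unsupported target modules: {unsupported}")

    return problems


def validate_adapter_config_file(path):
    """Load adapter_config.json from disk and validate it."""
    with open(path, encoding="utf-8") as f:
        return validate_adapter_config(json.load(f))
```

Other fields in the file are not inspected, matching the note above that additional fields are ignored.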

Enabling chat completions

To enable the chat completions API for your LoRA addon, add a fireworks.json file with the following contents to the directory that holds the adapter files:
{
  "conversation_config": {
    "style": "jinja",
    "args": {
      "template": "<YOUR_JINJA_TEMPLATE>"
    }
  }
}
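Jinja templates usually contain quotes and newlines, which must be escaped when embedded in a JSON string. One way to avoid escaping them by hand is to generate fireworks.json from a script. This is a minimal sketch: the function name write_fireworks_json is hypothetical, and the JSON structure is the one shown above.

```python
import json
from pathlib import Path


def write_fireworks_json(adapter_dir, jinja_template):
    """Write fireworks.json into adapter_dir with the given Jinja chat template."""
    payload = {
        "conversation_config": {
            "style": "jinja",
            "args": {"template": jinja_template},
        }
    }
    out_path = Path(adapter_dir) / "fireworks.json"
    # json.dumps escapes quotes, backslashes, and newlines inside the template.
    out_path.write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
    return out_path
```

You could then read the template from a separate file and pass its contents as jinja_template, which keeps the template itself readable and editable.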

Uploading the LoRA adapter

To upload a LoRA addon, run the following command. The MODEL_ID is an arbitrary resource ID to refer to the model within Fireworks.
Only some base models support LoRA addons.
firectl create model <MODEL_ID> /path/to/files/ --base-model "accounts/fireworks/models/<BASE_MODEL_ID>"

Next steps