Upload your own models from Hugging Face or elsewhere to deploy fine-tuned or custom-trained models optimized for your use case.
  • Multiple upload options – Upload from local files or directly from S3 buckets or Azure Blob Storage
  • Secure uploads – All uploads are encrypted and models remain private to your account by default

Requirements

Supported architectures

Fireworks supports most popular model architectures.

Required files

The files required depend on the model architecture, but in general you will need the standard Hugging Face model files:
  • Model configuration: config.json
    Fireworks does not support the quantization_config option in config.json.
  • Model weights in one of the following formats:
    • *.safetensors
    • *.bin
  • Weights index (for sharded weights): *.index.json
  • Tokenizer file(s), e.g.:
    • tokenizer.model
    • tokenizer.json
    • tokenizer_config.json
If the requisite files are not present, model deployment may fail.
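Before running firectl, you can sanity-check the directory locally. The helper below is a hypothetical pre-upload check (not part of firectl) that verifies the files listed above are present:

```python
from pathlib import Path

def check_model_dir(model_dir: str) -> list[str]:
    """Return a list of problems found in a model directory before upload.

    A hypothetical pre-upload sanity check, not part of firectl.
    """
    d = Path(model_dir)
    problems = []
    if not (d / "config.json").is_file():
        problems.append("missing config.json")
    weights = list(d.glob("*.safetensors")) + list(d.glob("*.bin"))
    if not weights:
        problems.append("no *.safetensors or *.bin weight files")
    # Sharded checkpoints ship an index file mapping tensors to shards
    if len(weights) > 1 and not list(d.glob("*.index.json")):
        problems.append("sharded weights but no *.index.json")
    tokenizers = ("tokenizer.model", "tokenizer.json", "tokenizer_config.json")
    if not any((d / t).is_file() for t in tokenizers):
        problems.append("no tokenizer files")
    return problems
```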

Customizing base model configuration

For base models (not LoRA adapters), you can customize the chat template and generation defaults by modifying the standard Hugging Face configuration files:
  • Chat template: Add or modify the chat_template field in tokenizer_config.json. See the Hugging Face guide on Templates for Chat Models for details.
  • Generation defaults: Modify generation_config.json to set default generation parameters like max_new_tokens, temperature, top_p, etc.
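For example, a minimal generation_config.json might look like this (the values are illustrative, not recommendations):

```json
{
  "max_new_tokens": 512,
  "temperature": 0.7,
  "top_p": 0.9
}
```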
You can also use a fireworks.json file with base models. If present, fireworks.json takes priority over generation_config.json. See Customizing generation defaults with fireworks.json for the full fireworks.json schema.
For LoRA adapters, you must use fireworks.json to customize generation defaults. Modifying generation_config.json in the adapter folder won’t work because adapters inherit these settings from their base model.

Uploading your model

For larger models, you can upload directly from cloud storage (S3 or Azure Blob Storage) for faster transfer instead of uploading from your local machine.
Upload from your local machine:
firectl model create <MODEL_ID> /path/to/files/
If you’re uploading an embedding model, add the --embedding flag.

Verifying your upload

After uploading, verify your model is ready to deploy:
firectl model get accounts/<ACCOUNT_ID>/models/<MODEL_ID>
Look for State: READY in the output. Once ready, you can create a deployment.
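If you script this check, a small helper can parse firectl's text output. The "State: READY" line format here is an assumption based on this guide; adjust the parsing if your firectl version prints something different:

```python
def is_ready(firectl_output: str) -> bool:
    """Return True if `firectl model get` text output reports State: READY.

    Assumes the output contains a "State: <VALUE>" line, as shown in this
    guide; adapt if your firectl version formats the output differently.
    """
    for line in firectl_output.splitlines():
        if line.strip().startswith("State:"):
            return line.split(":", 1)[1].strip() == "READY"
    return False
```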

Deploying your model

Once your model shows State: READY, create a deployment:
firectl deployment create accounts/<ACCOUNT_ID>/models/<MODEL_ID> --wait
See the On-demand deployments guide for configuration options like GPU types, autoscaling, and quantization.
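Once deployed, you can query the model over HTTP. The sketch below assumes Fireworks' OpenAI-compatible chat completions endpoint and uses only the standard library; the URL and payload shape are assumptions to verify against the API reference:

```python
import json
import urllib.request

# Assumption: deployments are reachable via Fireworks' OpenAI-compatible
# chat completions endpoint; URL and field names below reflect that API.
API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"

def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request for a deployed model."""
    body = json.dumps({
        "model": model,  # e.g. "accounts/<ACCOUNT_ID>/models/<MODEL_ID>"
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# To actually send it:
# with urllib.request.urlopen(build_request(model, "Hello!", api_key)) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```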

Publishing your model

By default, models are private to your account. Publish a model to make it available to other Fireworks users. A published model is:
  • Listed in the public model catalog
  • Deployable by anyone with a Fireworks account
  • Still hosted and controlled by your account
Publish a model:
firectl model update <MODEL_ID> --public
Unpublish a model:
firectl model update <MODEL_ID> --public=false

Importing fine-tuned models

In addition to models you fine-tune on the Fireworks platform, you can also upload your own custom fine-tuned models as LoRA adapters.
Uploaded LoRA adapters can only be deployed to on-demand (dedicated) deployments. Serverless deployment is not supported.

Requirements

Your custom LoRA addon must contain the following files:
  • adapter_config.json - The Hugging Face adapter configuration file
  • adapter_model.bin or adapter_model.safetensors - The saved addon file
The adapter_config.json must contain the following fields:
  • r - The LoRA rank. Must be an integer between 4 and 64, inclusive
  • target_modules - A list of target modules. Currently the following target modules are supported:
    • q_proj
    • k_proj
    • v_proj
    • o_proj
    • up_proj or w1
    • down_proj or w2
    • gate_proj or w3
    • block_sparse_moe.gate
Additional fields may be specified but are ignored.
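A minimal adapter_config.json meeting these requirements might look like the following (lora_alpha and peft_type are standard PEFT fields shown for illustration; as noted above, extra fields are ignored):

```json
{
  "r": 16,
  "lora_alpha": 32,
  "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
  "peft_type": "LORA"
}
```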

Customizing generation defaults with fireworks.json

For LoRA adapters, use a fireworks.json file to customize generation defaults. Modifying generation_config.json in the adapter folder won't work, because adapters inherit those settings from their base model. Add a fireworks.json file to the directory containing your adapter files:
fireworks.json
{
  "defaults": {
    "stop": ["<|im_end|>", "</s>"],
    "max_tokens": 1024,
    "temperature": 0.7,
    "top_k": 50,
    "top_p": 0.9,
    "min_p": 0.0,
    "typical_p": 1.0,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.0
  },
  "model_arch": null,
  "model_config_name": null,
  "has_lora": true,
  "has_teft": false
}
These defaults are applied when the user doesn’t specify values in their API request:
  • stop (array, e.g. ["<|im_end|>", "</s>"]) – Default stop sequences
  • max_tokens (integer, e.g. 1024) – Default maximum number of tokens to generate
  • temperature (float, e.g. 0.7) – Default sampling temperature
  • top_k (integer, e.g. 50) – Default top-k sampling
  • top_p (float, e.g. 0.9) – Default nucleus sampling probability
  • min_p (float, e.g. 0.0) – Default minimum probability threshold
  • typical_p (float, e.g. 1.0) – Default typical sampling probability
  • frequency_penalty (float, e.g. 0.0) – Default frequency penalty
  • presence_penalty (float, e.g. 0.0) – Default presence penalty
  • repetition_penalty (float, e.g. 1.0) – Default repetition penalty
The remaining top-level fields:
  • model_arch (default: null) – Model architecture (e.g., "qwen2", "llama"). Usually auto-detected from the base model
  • model_config_name (default: null) – Model configuration name (e.g., "4B"). Usually auto-detected from the base model
  • has_lora (default: true) – Set to true for LoRA adapters
  • has_teft (default: false) – Set to true if using TEFT (Token-Efficient Fine-Tuning)
All fields in fireworks.json are optional. Include only the fields you need to override.
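Since every field is optional, a fireworks.json that overrides only a couple of defaults is valid. For example (values are illustrative):

```json
{
  "defaults": {
    "temperature": 0.2,
    "stop": ["</s>"]
  }
}
```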

Uploading the LoRA adapter

To upload a LoRA addon, run the following command. MODEL_ID is an arbitrary resource ID used to refer to the model within Fireworks.
Only some base models support LoRA addons.
firectl model create <MODEL_ID> /path/to/files/ --base-model "accounts/fireworks/models/<BASE_MODEL_ID>"

Next steps

  • Deploy your model – Configure GPU types, autoscaling, and optimization
  • Quantization – Reduce serving costs with model quantization
  • Fine-tuning – Fine-tune models before deploying them