Supervised Fine-Tuning (SFT) is critical for adapting general-purpose Large Language Models (LLMs) to domain-specific tasks, significantly improving performance in real-world applications. Fireworks AI facilitates easy and scalable SFT through its intuitive APIs and support for Low-Rank Adaptation (LoRA), allowing efficient fine-tuning without full parameter updates.

Understanding LoRA

LoRA significantly reduces the computational and memory cost of fine-tuning large models: instead of updating all model weights, it trains small low-rank matrices that are added on top of the frozen base weights, making it particularly suitable for large models like LLaMA or DeepSeek. Fireworks AI supports LoRA tuning and allows up to 100 LoRA adapters to run simultaneously on a dedicated deployment at no extra cost.
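To make the low-rank idea concrete, here is a minimal NumPy sketch (illustrative only, not Fireworks code) of how a LoRA adapter adds a trainable low-rank update on top of a frozen weight matrix. The dimensions, rank, and scaling factor below are arbitrary example values.
import numpy as np

# Illustrative sizes only: a frozen base weight W (d_out x d_in)
# and a LoRA adapter of rank r, where r << min(d_out, d_in).
d_in, d_out, r = 4096, 4096, 8
alpha = 16  # LoRA scaling factor; effective scale is alpha / r

rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))      # frozen base weights (never updated)
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, initialized to zero

x = rng.normal(size=(d_in,))

# Forward pass: base output plus the low-rank update (alpha / r) * B @ A @ x.
# Only A and B are trained: r * (d_in + d_out) = 65,536 parameters instead of
# the full d_out * d_in = 16,777,216 -- this is what keeps LoRA tuning cheap.
h = W @ x + (alpha / r) * (B @ (A @ x))
print(h.shape)  # (4096,)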

List of Supported Models

We currently support fine-tuning Llama 3 models, Qwen 2/2.5 models, Qwen 3 non-MoE models, DeepSeek V3/R1 models, Gemma 3 models, and Phi 3/4 models, as well as any custom model with one of these supported architectures. When you create a supervised fine-tuning job in the UI following the tutorial below, the models listed in the dropdown are all supported. You will also notice that LoRA models for those supported architectures appear in the dropdown, because we support continued fine-tuning from existing LoRA models.

Step-by-Step Guide to Fine-Tuning with Fireworks AI

1. Preparing the Dataset

Datasets must adhere strictly to the JSONL format, where each line is a complete JSON-formatted training example. Requirements:
  • Minimum examples needed: 3
  • Maximum examples: Up to 3 million examples per dataset
  • File format: JSONL (each line is a valid JSON object)
  • Message Schema: Each training sample must include a messages array, where each message is an object with two fields:
    • role: one of system, user, or assistant
    • content: a string representing the message content
  • Sample weight: Optional key weight at the root of the JSON object. It can be any floating point number (positive, negative, or 0) and is used as a loss multiplier for tokens in that sample. If used, this field must be present in all samples in the dataset.
Here’s an example conversation dataset (one training example):
{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "Paris."}
  ]
}
Here’s an example conversation dataset with sample weights:
{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "Paris."}
  ],
  "weight": 0.5
}
We also support function-calling datasets that include a list of tools. An example looks like:
{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_car_specs",
        "description": "Fetches detailed specifications for a car based on the given trim ID.",
        "parameters": {
          "trimid": {
            "description": "The trim ID of the car for which to retrieve specifications.",
            "type": "int",
            "default": ""
          }
        }
      }
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": "What is the specs of the car with trim 121?"
    },
    {
      "role": "assistant",
      "tool_calls": [
        {
          "type": "function",
          "function": {
            "name": "get_car_specs",
            "arguments": "{\"trimid\": 121}"
          }
        }
      ]
    }
  ]
}
Save this dataset locally as a JSONL file, for example trader_poe_sample_data.jsonl.
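Before uploading, it is worth sanity-checking the file against the requirements above (the Best Practices section below also recommends validating datasets before upload). The snippet below is a minimal validation sketch using only the Python standard library; the checks mirror the schema described in this section and are not an official validator.
import json

VALID_ROLES = {"system", "user", "assistant"}

def validate_jsonl(path: str) -> None:
    """Lightweight checks mirroring the dataset requirements above."""
    samples = []
    with open(path, "r", encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue
            sample = json.loads(line)  # raises if the line is not valid JSON
            messages = sample.get("messages")
            assert isinstance(messages, list) and messages, f"line {line_no}: missing 'messages'"
            for msg in messages:
                # Assistant messages in function-calling data may carry
                # 'tool_calls' instead of a plain 'content' string.
                assert msg.get("role") in VALID_ROLES, f"line {line_no}: bad role"
            samples.append(sample)

    assert 3 <= len(samples) <= 3_000_000, "dataset must have between 3 and 3 million examples"

    # If the optional 'weight' key is used, it must be present on every sample.
    weighted = sum("weight" in s for s in samples)
    assert weighted in (0, len(samples)), "'weight' must be set on all samples or none"
    print(f"OK: {len(samples)} examples")

validate_jsonl("trader_poe_sample_data.jsonl")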

2. Uploading the Dataset to Fireworks AI

There are several ways to upload a dataset to the Fireworks platform for fine-tuning: firectl, the RESTful API, the builder SDK, or the UI.
  • Upload dataset through the UI: Navigate to the Datasets tab, click Create Dataset, and follow the wizard.
  • Upload dataset using firectl:
firectl create dataset <dataset-id> /path/to/jsonl/file
  • Upload dataset using the RESTful API: You need to make two separate HTTP requests: one to create the dataset entry and one to upload the JSONL file. Full reference here: Create dataset. Note that the exampleCount parameter needs to be provided by the client.
// Create Dataset Entry
// BASE_URL, HEADERS, and HEADERS_WITH_CONTENT_TYPE are placeholders for your
// account-scoped API base URL and request headers (including your API key).
const createDatasetPayload = {
  datasetId: "trader-poe-sample-data",
  dataset: { userUploaded: {} }
  // Additional params such as exampleCount
};
const urlCreateDataset = `${BASE_URL}/datasets`;
const response = await fetch(urlCreateDataset, {
  method: "POST",
  headers: HEADERS_WITH_CONTENT_TYPE,
  body: JSON.stringify(createDatasetPayload)
});

// Upload JSONL file
// DATASET_ID must match the datasetId used above.
const urlUpload = `${BASE_URL}/datasets/${DATASET_ID}:upload`;
const files = new FormData();
files.append("file", localFileInput.files[0]);

const uploadResponse = await fetch(urlUpload, {
  method: "POST",
  headers: HEADERS,
  body: files
});
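The same two-request flow can also be scripted from Python. The sketch below mirrors the JavaScript above with the requests library; BASE_URL is assumed to be your account-scoped API base URL and FIREWORKS_API_KEY an environment variable holding your API key, so adjust both to match the Create dataset reference.
import os
import requests

# Placeholders mirroring the JavaScript example above.
BASE_URL = os.environ["FIREWORKS_BASE_URL"]
HEADERS = {"Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}"}

dataset_id = "trader-poe-sample-data"

# 1) Create the dataset entry (plus any additional params such as exampleCount).
resp = requests.post(
    f"{BASE_URL}/datasets",
    headers={**HEADERS, "Content-Type": "application/json"},
    json={"datasetId": dataset_id, "dataset": {"userUploaded": {}}},
)
resp.raise_for_status()

# 2) Upload the JSONL file as multipart form data.
with open("trader_poe_sample_data.jsonl", "rb") as f:
    upload = requests.post(
        f"{BASE_URL}/datasets/{dataset_id}:upload",
        headers=HEADERS,
        files={"file": f},
    )
upload.raise_for_status()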
While all of the above approaches work, the UI is best suited for smaller datasets (under 500 MB), while firectl tends to work better for larger datasets.

3. Creating a Fine-Tuning Job

Similarly, there are a few different approaches for creating a supervised fine-tuning job. We highly recommend creating supervised fine-tuning jobs via the UI. To do that, simply navigate to the Fine-Tuning tab, click Fine-Tune a Model, and follow the wizard from there.
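If you prefer to create the job from a script instead, the REST API (or firectl / the builder SDK listed in the Appendix) exposes supervised fine-tuning jobs as well. The sketch below uses Python's requests; the supervisedFineTuningJobs resource name and the payload field names are assumptions rather than the confirmed schema, so verify them against the RESTful API references in the Appendix before relying on this.
import os
import requests

# Same placeholders as the upload sketch above.
BASE_URL = os.environ["FIREWORKS_BASE_URL"]
HEADERS = {
    "Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}",
    "Content-Type": "application/json",
}

# Assumed resource and field names -- verify against the API reference.
# loraRank and earlyStop match the parameter names discussed under Best
# Practices; the base model path and dataset ID are illustrative.
payload = {
    "baseModel": "accounts/fireworks/models/llama-v3p1-8b-instruct",
    "dataset": "trader-poe-sample-data",
    "loraRank": 8,
    "earlyStop": False,
}

resp = requests.post(
    f"{BASE_URL}/supervisedFineTuningJobs", headers=HEADERS, json=payload
)
resp.raise_for_status()
job = resp.json()
print(job.get("name"), job.get("state"))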

4. Monitoring and Managing Fine-Tuning Jobs

Once the job is created, it will show up in the list of jobs. Click a job to view its details.
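Outside the UI, you can also poll the job resource until it finishes. The sketch below continues the assumptions from the previous snippet; the resource path format and the state strings are illustrative, not confirmed, so check the job details page or the API reference for the actual values.
import os
import time
import requests

BASE_URL = os.environ["FIREWORKS_BASE_URL"]
HEADERS = {"Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}"}

def wait_for_job(job_name: str, poll_seconds: int = 60) -> dict:
    """Poll a fine-tuning job until it leaves a pending/running state.

    `job_name` is the job's resource name returned at creation time
    (e.g. "supervisedFineTuningJobs/<id>" -- illustrative; verify the format).
    The state strings below are assumptions.
    """
    while True:
        resp = requests.get(f"{BASE_URL}/{job_name}", headers=HEADERS)
        resp.raise_for_status()
        job = resp.json()
        state = job.get("state", "")
        print(f"{job_name}: {state}")
        if state in ("JOB_STATE_PENDING", "JOB_STATE_RUNNING"):
            time.sleep(poll_seconds)
            continue
        return job  # completed, failed, cancelled, or an unrecognized state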

5. Deploying the Fine-Tuned Model

After fine-tuning is complete, you have three options for deploying the model.
  • Serverless Deployment: Only supported for specific models with serverless support (the list changes from time to time; look for models in the Model Library with the serverless tag).
  • Dedicated Deployment: Create a deployment, then deploy the LoRA model on top of it as an addon. Hundreds of LoRA models can be deployed as addons onto a single deployment. Learn more about it here: multi-lora deployment
  • Live-merge Deployment: Create a deployment by merging the LoRA weights directly into the base model, optimized for latency and speed.
For a guide on how to create a deployment (dedicated or live-merge), please follow the guide here: creating deployment. To deploy a LoRA addon to an existing deployment or to serverless, simply click the Deploy this LoRA button on the supervised fine-tuning job details page or the LoRA model details page, and follow the wizard.
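Once the LoRA model is deployed (serverless or as an addon), you can query it through Fireworks' OpenAI-compatible chat completions endpoint. A minimal sketch is shown below; the model name is a hypothetical placeholder, so substitute the resource name shown on your model details page after deployment.
import os
import requests

API_KEY = os.environ["FIREWORKS_API_KEY"]

resp = requests.post(
    "https://api.fireworks.ai/inference/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"},
    json={
        # Hypothetical placeholder -- use your deployed model's resource name.
        "model": "accounts/<account-id>/models/<your-fine-tuned-model>",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is the capital of France?"},
        ],
    },
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])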

6. Best Practices and Considerations

  • Validate your dataset thoroughly before uploading.
  • Use a higher loraRank for greater adapter capacity (e.g., 8 for complex tasks). With a higher loraRank, consider using a smaller learning rate as well.
  • Monitor job health and logs for errors.
  • We generally recommend setting earlyStop = False for training.
  • Use descriptive names for dataset IDs and models for clarity.

Appendix

  • Python builder SDK references
  • RESTful API references
  • firectl references