Understanding LoRA
LoRA significantly reduces the computational and memory cost of fine-tuning large models by updating the LLM parameters through a low-rank structure, making it particularly suitable for large models like LLaMA or DeepSeek. Fireworks AI supports LoRA tuning, allowing up to 100 LoRA adapters to run simultaneously on a dedicated deployment at no extra cost.
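As background, this is the standard LoRA formulation (not specific to Fireworks): the pretrained weight matrix is frozen and only a low-rank update is trained.

$$
W \;=\; W_0 + \Delta W \;=\; W_0 + BA, \qquad B \in \mathbb{R}^{d \times r},\quad A \in \mathbb{R}^{r \times k},\quad r \ll \min(d, k)
$$

Only the roughly $r(d+k)$ adapter parameters per matrix are trained and stored, which is why many independent adapters can share a single deployment of the same base model.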
List of Supported Models
We currently support fine-tuning models with the following architectures:
- Llama 3
- Llama 4 (text only)
- Qwen 2/2.5
- Qwen 2/2.5 VL (see the VLM fine-tuning section)
- Qwen 3 models (MoE and dense; multi-turn fine-tuning with thinking trace support)
- DeepSeek V3/R1 (multi-turn fine-tuning; thinking trace support for R1)
- Gemma 3
- Phi 3/4
- GPT OSS (multi-turn fine tuning with thinking trace support)
- Any custom model based on one of the above supported architectures
Fine-tuning a model using SFT
1. Confirm model support for fine-tuning
You can confirm that a base model is available for fine-tuning by looking for the Tunable tag in the model library, or by using firectl and looking for Tunable: true. Some base models cannot be tuned on Fireworks (Tunable: false) but still list support for LoRA (Supports Lora: true). This means that users can tune a LoRA for this base model on a separate platform and upload it to Fireworks for inference. Consult importing fine-tuned models for more information.
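A minimal firectl sketch for inspecting a model's metadata; the model ID is a placeholder, and the exact output formatting should be verified against firectl's own help:

```bash
# Inspect a base model's metadata to check whether it can be tuned.
# <MODEL_ID> is a placeholder; field names follow the doc above, but
# verify the exact output with `firectl get model --help`.
firectl get model <MODEL_ID>
# Look for:
#   Tunable: true          -> can be fine-tuned on Fireworks
#   Supports Lora: true    -> LoRA adapters can be served for this base model
```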
2. Prepare a dataset
Datasets must be in JSONL format, where each line is a complete JSON-formatted training example. Make sure your data conforms to the following restrictions:
- Minimum examples: 3
- Maximum examples: 3 million per dataset
- File format: .jsonl
- Message schema: Each training sample must include a messages array, where each message is an object with the following fields:
  - role: one of system, user, or assistant. A message with the system role is optional, but if specified, it must be the first message of the conversation.
  - content: a string containing the message content.
  - weight: an optional key whose value must be 0 or 1; a message is skipped during training if its weight is set to 0.
We also support function calling datasets that include a list of tools. For the subset of models that support thinking (e.g. DeepSeek R1, GPT OSS models, and Qwen3 thinking models), we also support fine-tuning with thinking traces: the dataset can include thinking traces for assistant turns. Though optional, ideally every assistant turn includes a thinking trace. Illustrative examples are sketched below. Note that when fine-tuning with intermediate thinking traces, the total number of tuned tokens can exceed the total number of tokens in the dataset, because we perform preprocessing and expand the dataset to ensure train-inference consistency.
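For illustration, a minimal training example that follows the schema above could look like this (a single JSONL line; the values are placeholders):

```json
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Summarize LoRA in one sentence."}, {"role": "assistant", "content": "LoRA fine-tunes a model by training small low-rank adapter matrices while keeping the base weights frozen.", "weight": 1}]}
```

Function calling and thinking-trace examples might look roughly like the sketches below. The tools, tool_calls, and reasoning_content key names are assumptions borrowed from common OpenAI-style chat schemas rather than confirmed field names, so check the dataset reference for the exact format expected by Fireworks:

```json
{"messages": [{"role": "user", "content": "What is the weather in Paris?"}, {"role": "assistant", "content": "", "tool_calls": [{"type": "function", "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"}}]}], "tools": [{"type": "function", "function": {"name": "get_weather", "description": "Look up the current weather for a city", "parameters": {"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]}}]}
{"messages": [{"role": "user", "content": "Is 97 prime?"}, {"role": "assistant", "reasoning_content": "97 is not divisible by 2, 3, 5, or 7, and 11*11 = 121 > 97, so it is prime.", "content": "Yes, 97 is a prime number."}]}
```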
3. Create and upload a dataset
There are a couple of ways to upload a dataset to the Fireworks platform for fine-tuning: firectl, the RESTful API, the builder SDK, or the UI. In the UI, you can simply navigate to the dataset tab, click Create Dataset, and follow the wizard. While all of the above approaches should work, the UI is more suitable for smaller datasets (< 500MB), while firectl might work better for bigger datasets. Ensure the dataset ID conforms to the resource ID restrictions.
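For bigger datasets uploaded with firectl, a minimal sketch; the dataset ID and file path are placeholders, and the exact argument order should be verified with firectl's help:

```bash
# Create a dataset on Fireworks from a local JSONL file.
# "my-sft-dataset" and the path are placeholders; verify the syntax with
# `firectl create dataset --help`.
firectl create dataset my-sft-dataset /path/to/training_data.jsonl

# Confirm the dataset was uploaded and is ready to use.
firectl get dataset my-sft-dataset
```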
4. Launch a fine-tuning job
There are also a couple of ways to launch a fine-tuning job. We highly recommend creating supervised fine-tuning jobs via the UI: simply navigate to the Fine-Tuning tab, click Fine-Tune a Model, and follow the wizard from there. You can even pick a LoRA model as the starting point for continued training. With the UI, once the job is created it will show up in the list of jobs; click into the job details to monitor its progress. With firectl, you can monitor the progress of the tuning job by running the command sketched below. Once the job successfully completes, you will see the new LoRA model in your model list.
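A minimal monitoring sketch with firectl; the job ID is a placeholder, and sftj (supervised fine-tuning job) is assumed to be the resource name used by your firectl version, so verify with firectl --help:

```bash
# List supervised fine-tuning jobs, then check the state of a specific job.
# "my-sft-job" is a placeholder for the job ID returned at creation time.
firectl list sftj
firectl get sftj my-sft-job
```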
Deploying a fine-tuned model using an on-demand deployment
Use the following command to deploy your fine-tuned model using an on-demand deployment:
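A rough firectl sketch, assuming the fine-tuned model ID produced by the tuning job above; whether a LoRA model can be deployed directly this way, or instead needs to be loaded onto a deployment of its base model, should be verified in the deployments documentation:

```bash
# Create an on-demand (dedicated) deployment serving the fine-tuned model.
# "my-tuned-model" is a placeholder for the LoRA model produced by the job.
firectl create deployment my-tuned-model
```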
Deploying a fine-tuned model serverlessly
Not all base models support serverless add-ons. Please check the list of serverless models that support LoRA add-ons.
Unused addons may be automatically unloaded after a week.
Additional SFT job settings
Additional tuning settings are available when starting a fine-tuning job. All of the settings below are optional and have reasonable defaults if not specified. For settings that affect tuning quality, like epochs and learning rate, we recommend keeping the defaults and only changing hyperparameters if results are not as desired. All tuning options must be specified via command-line flags, as shown in the example below the list.
- evaluation_dataset: the ID of a separate dataset to use for evaluation. Must be pre-uploaded via firectl.
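As an example of passing settings as command-line flags, a hedged firectl sketch; the flag names below (--base-model, --dataset, --evaluation-dataset, --output-model, --epochs, --learning-rate) are assumptions and should be checked against firectl create sftj --help:

```bash
# Launch a supervised fine-tuning job with explicit optional settings.
# All IDs are placeholders; flag names are assumptions -- confirm them with
# `firectl create sftj --help` before running.
firectl create sftj \
  --base-model llama-v3p1-8b-instruct \
  --dataset my-sft-dataset \
  --evaluation-dataset my-eval-dataset \
  --output-model my-tuned-model \
  --epochs 2 \
  --learning-rate 1e-4
```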
Appendix
- Python builder SDK references
- RESTful API references
- firectl references