# Basics of the Build SDK

## Why use the Build SDK?
The Fireworks Build SDK gives you a declarative way to work with Fireworks resources like deployments, fine-tuning jobs, and datasets. We’ve designed it to handle all the infrastructure complexity for you, letting you focus on building your application. Instead of using the web UI, CLI, or raw API calls, you can manage everything through simple Python code with smart, logical defaults without sacrificing control and customizability.
The SDK is built on the following principles:

- **Object-oriented**: Fireworks primitives are represented as Python objects. You can access their capabilities and properties through methods and attributes.
- **Declarative**: You describe your desired state and the SDK handles reconciliation.
- **Smart defaults**: The SDK infers the most logical defaults for you, prioritizing development speed and lowest cost. For example:
  - The SDK automatically uses a serverless deployment for models that are available serverlessly, unless you specify otherwise.
  - When creating deployments, the SDK enables scale-to-zero with the shortest possible scale-down window.
  - If the SDK determines that a resource already exists by matching its signature (see below), it re-uses the existing resource instead of creating a new one.
- **Customizable**: Although the SDK applies smart defaults, you still have full access to the configuration parameters for any Fireworks resource.
The Build SDK is currently in beta and not all functionality may be supported. Please reach out to dhuang@fireworks.ai to report any issues or feedback.
## The `LLM` class

Running a model on Fireworks is as simple as instantiating the `LLM` class and calling a single function. Here's an example of how to instantiate the latest Llama 4 Maverick model using the Build SDK.
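A minimal sketch of that flow, assuming the `fireworks` package (`fireworks-ai` on PyPI) is installed and a `FIREWORKS_API_KEY` is set in your environment; the short model id below is illustrative:

```python
import os

# Chat Completions-style message list to send to the model.
messages = [{"role": "user", "content": "Say hello in one short sentence."}]

# Guarded so the sketch is a no-op without credentials.
if os.environ.get("FIREWORKS_API_KEY"):
    from fireworks import LLM

    llm = LLM(
        model="llama4-maverick-instruct-basic",  # assumed short model id
        deployment_type="serverless",            # pay per token; no dedicated deployment
    )
    response = llm.chat.completions.create(messages=messages)
    print(response.choices[0].message.content)
```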
You can pass various parameters to the `LLM()` constructor to take full advantage of all of Fireworks' customization options.
## Fine-tuning a model
You can fine-tune a model by creating a `Dataset` object and then calling the `.create_supervised_fine_tuning_job()` method on the `LLM` object.
Datasets are files in JSONL format, where each line is a complete JSON-formatted training example following the Chat Completions API format. See fine-tuning a model for an example. Once you have training examples prepared, create a `Dataset` object and upload it to Fireworks using `from_file()`, `from_string()`, or `from_list()`, then pass it to the `.create_supervised_fine_tuning_job()` method on the `LLM` object.
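The flow can be sketched as follows. This is a hedged example: it assumes `Dataset.from_list()` accepts Chat Completions-style examples and that the fine-tuning job takes a required `name`; the model id and job name are illustrative, and the API calls are guarded behind a credentials check.

```python
import os

# Each training example follows the Chat Completions message format.
training_examples = [
    {
        "messages": [
            {"role": "user", "content": "What is the capital of France?"},
            {"role": "assistant", "content": "Paris."},
        ]
    },
]

if os.environ.get("FIREWORKS_API_KEY"):
    from fireworks import LLM, Dataset

    # Upload the in-memory examples as a Dataset resource.
    dataset = Dataset.from_list(training_examples)

    llm = LLM(model="llama-v3p1-8b-instruct", deployment_type="auto")
    job = llm.create_supervised_fine_tuning_job(
        "my-sft-job",  # required: a unique job name (part of the resource signature)
        dataset=dataset,
    )
```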
## Debug mode
Sometimes it can be helpful to see the log of actions the SDK takes behind the scenes. You can enable debug mode by setting the `FIREWORKS_SDK_DEBUG=True` environment variable.
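For example, you can set the variable from Python before the SDK performs any actions (setting it in your shell before launching the script works equally well):

```python
import os

# Enable Build SDK debug logging. Set this before importing or
# instantiating any SDK objects so all actions are logged.
os.environ["FIREWORKS_SDK_DEBUG"] = "True"

print(os.environ["FIREWORKS_SDK_DEBUG"])  # → True
```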
## Key concepts

### Resource types
The SDK supports the following resource types:
- `LLM` - Represents a model running on a deployment
- `Dataset` - Represents a dataset used to create a fine-tuning job
- `SupervisedFineTuningJob` - Represents a fine-tuning job
### Deployment type selection
The SDK tries to be parsimonious with the way it deploys resources. Fireworks provides two types of deployment options:

- `serverless` hosting is available for some commonly used, state-of-the-art models. Pricing is per token, i.e. you only pay for the tokens you use, subject to rate limits.
- `on-demand` hosting is available for all other models. Pricing is per GPU-second. This hosting is required for models that are not available serverlessly, or for workloads that exceed serverless rate limits.
For non-finetuned models, you can always specify the deployment type of `LLM()` by passing either `"serverless"` or `"on-demand"` as the `deployment_type` parameter to the constructor. If the model is not available for the deployment type you selected, the SDK throws an error. The SDK can also decide the best deployment strategy on your behalf: just pass `deployment_type="auto"`. If the model is available serverlessly, the SDK uses serverless hosting; otherwise it creates an on-demand deployment.
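A sketch of the three options described above, assuming the `fireworks` package and an API key; the model id is illustrative, and the constructor calls are guarded so the sketch is a no-op without credentials:

```python
import os

# The deployment_type values discussed above.
DEPLOYMENT_TYPES = ("serverless", "on-demand", "auto")

if os.environ.get("FIREWORKS_API_KEY"):
    from fireworks import LLM

    # Errors if the model has no serverless availability.
    serverless_llm = LLM(model="llama-v3p1-8b-instruct", deployment_type="serverless")

    # Always uses a dedicated, per-GPU-second deployment.
    dedicated_llm = LLM(model="llama-v3p1-8b-instruct", deployment_type="on-demand")

    # Serverless when available, otherwise an on-demand deployment.
    auto_llm = LLM(model="llama-v3p1-8b-instruct", deployment_type="auto")
```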
Be careful with the `deployment_type` parameter, especially for `"auto"` and `"on-demand"` deployments. While the SDK tries to make the most cost-effective choice for you and put sensible autoscaling policies in place, it is possible to unintentionally create many deployments that lead to unwanted spend, especially when working with non-serverless models.
For finetuned (LoRA) models, the `deployment_type` values behave as follows:

- `"serverless"` tries to deploy the finetuned model to serverless hosting.
- `"on-demand"` creates an on-demand deployment of your base model and merges in your LoRA weights.
- `"on-demand-lora"` creates an on-demand deployment with Multi-LoRA enabled.
- `"auto"` tries to use serverless if available, otherwise falls back to `on-demand-lora`.
### Resource signatures
Each resource has a signature, which is the set of properties that are used to identify the resource. The SDK will use the signature to determine if a resource already exists and can be re-used.
| Resource | Signature |
|---|---|
| `LLM` | `display_name` and `model` |
| `Dataset` | `hash(data)` and `filename` |
| `SupervisedFineTuningJob` | `name` and `model` |
For `LLM` resources, the SDK defaults to setting the `display_name` to the filename of the script that instantiates it. You can override this by passing your own `display_name` when creating the resource.
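As a small sketch (assuming the `fireworks` package and an API key; the names and model id are illustrative), overriding the default `display_name` looks like:

```python
import os

# A unique display_name gives the resource a distinct signature,
# independent of which script file instantiates it.
DISPLAY_NAME = "my-experiment-llm"

if os.environ.get("FIREWORKS_API_KEY"):
    from fireworks import LLM

    # Without display_name, the SDK would derive it from this script's filename.
    llm = LLM(model="llama-v3p1-8b-instruct", display_name=DISPLAY_NAME)
```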
For `Dataset` resources, the resource signature is derived from the `filename` of your dataset (if created via `from_file()`) and the hash of the data itself.
For `SupervisedFineTuningJob` resources, you are required to pass a `name` when creating the resource.
### Resource management
The SDK also tries to be parsimonious with the number of resources it creates. Before creating a resource, the SDK first checks whether a resource with the same signature already exists. If so, the SDK re-uses the existing resource instead of creating a new one; re-use may involve updating the existing resource with new configuration parameters.
A new resource will be created in the following cases for each resource type:
| Resource | Created by SDK if… |
|---|---|
| `LLM` | You pass a unique `display_name` to the constructor, instantiate the `LLM` from a unique file, or use a unique `model` |
| `Dataset` | You change the `filename` of the data or modify the data itself |
| `SupervisedFineTuningJob` | You pass a unique `name` when creating the fine-tuning job or use a unique `model` |