Flumina Server Apps Reference - Fireworks AI Docs

Authoring Flumina Server Apps

Prerequisites

Ensure you follow the installation steps in the Quick Start.

Initialize a Flumina Repository

The core organization concept in firectl is that of an app. An app is a directory that contains the following:

Code that defines the behavior of the Server App
- Crucially, flumina.py defines the “entrypoint” that the Fireworks infrastructure understands and uses to bring up the necessary infrastructure
(recommended) Serialized weights for deep learning models
- We highly recommend packaging weights inside the repository to be loaded locally on the server. Downloading weights at server initialization is discouraged.

As in the quick start, let’s initialize an empty repository using the flumina CLI tool that was installed with the fireworks-ai package:

mkdir flumina_test
cd flumina_test
flumina init app

The flumina init app command adds a few things to the current directory:

% ls
data
fireworks.json
flumina.py

fireworks.json marks the directory as a Flumina repository for downstream tools. You don’t need to edit this file.
flumina.py defines the entrypoint for the Server App. This is where you add your model-calling and business logic, as covered below.
requirements.txt is where you specify Python package dependencies in pip requirements format. The Fireworks service installs these before running your app.
The data directory is provided for convenience. We recommend placing assets such as model weights in this directory and referencing them via relative paths in your code.
(Hidden files such as .fluminaignore are also present. You can ignore these for now.)
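
One way to reference assets in the data directory by relative path is to resolve them against the app directory rather than the process's working directory. The sketch below is a minimal illustration using only the standard library; the helper name asset_path and the file name weights.bin are hypothetical and not part of Flumina.

```python
from pathlib import Path


def asset_path(name: str, app_root: Path) -> Path:
    """Return the absolute path of an asset stored under <app_root>/data.

    `app_root` is the directory that contains flumina.py. Inside flumina.py
    it could be obtained with Path(__file__).resolve().parent.
    """
    data_dir = (app_root / "data").resolve()
    candidate = (data_dir / name).resolve()
    try:
        # Reject names such as "../secret" that would escape the data directory.
        candidate.relative_to(data_dir)
    except ValueError:
        raise ValueError(f"{name!r} is outside the data directory") from None
    return candidate
```

Inside flumina.py, a call such as asset_path("weights.bin", Path(__file__).resolve().parent) would then yield the same location no matter where the server process was started.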

flumina.py: Server App Definition

flumina.py is the script that the Fireworks service invokes when running your Server App. FluminaModule is the main interface on which you will add your logic. The default template contains a FluminaModule with minimal functionality implemented. You can change this as you wish to develop your Server App. A summary of the key parts:

__init__

__init__ is where you should initialize assets to be used by the server. This includes deep learning modules (PyTorch nn.Modules) and auxiliary state.
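
The reason for initializing assets in __init__ is that the work happens once, when the module is constructed, and not on every request. The sketch below illustrates the pattern with a plain Python class standing in for a FluminaModule subclass; StandInModule, load_weights, and the arithmetic in infer are all hypothetical placeholders, and no Flumina or PyTorch API is used.

```python
from typing import Callable, List


class StandInModule:
    """Plain-Python stand-in for a FluminaModule subclass (illustration only)."""

    def __init__(self, load_weights: Callable[[], List[float]]):
        # Expensive setup (for example, reading weights from the data
        # directory) runs exactly once, at construction time.
        self.weights = load_weights()
        self.requests_served = 0

    def infer(self, input_val: int) -> float:
        # Request handlers reuse the state prepared in __init__.
        self.requests_served += 1
        return sum(w * input_val for w in self.weights)
```

Constructing the object once and calling infer many times triggers a single call to load_weights, which is the behavior a server wants for large model weights.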

Routes

The main way that logic in your Server App is exposed to the outside world is through routes. Flumina uses a design similar to FastAPI for route definition: methods on your FluminaModule should have a route decorator applied to them to expose the logic as an API endpoint. For example, from the template:

from pydantic import BaseModel
import fireworks.flumina.route as route

class ModuleRequest(BaseModel):
    input_val: int


class ModuleResponse(BaseModel):
    output_val: float

<...>

    @route.post("/infer")
    async def infer(self, input: ModuleRequest):
        # Add your endpoint logic here
        #
        # Example below
        model_out = self(input.input_val)
        return ModuleResponse(output_val=model_out.item())

This code exposes /infer as an API endpoint, with Pydantic models defining the request and response payloads. The sample request that flumina deploy prints is generated from the routes registered on your app, for example:

curl -X POST 'https://api.fireworks.ai/inference/v1/workflows/accounts/my-account/models/my-flumina-app/infer?deployment=accounts/my-account/deployments/de614476' \
    -H 'Authorization: Bearer API_KEY' \
    -H "Content-Type: application/json" \
    -d '{
        "input_val": "<value>"
    }'
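
The same request can be assembled from Python. The sketch below uses only the standard library and mirrors the URL layout and headers of the curl example above; build_infer_request and its parameters are illustrative names, not part of any Fireworks SDK. It sends an integer for input_val because ModuleRequest declares that field as int.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.fireworks.ai/inference/v1/workflows"


def build_infer_request(account: str, app: str, deployment: str,
                        api_key: str, input_val: int) -> Request:
    """Build (but do not send) a POST request to the /infer route."""
    model = f"accounts/{account}/models/{app}"
    # Keep "/" unescaped so the query string matches the curl example.
    query = urlencode(
        {"deployment": f"accounts/{account}/deployments/{deployment}"},
        safe="/",
    )
    body = json.dumps({"input_val": input_val}).encode("utf-8")
    return Request(
        f"{API_BASE}/{model}/infer?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it; that step is left out so the sketch runs without network access or credentials.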

Entrypoints

The default app script has two entrypoints:

if __name__ == "__flumina_main__". Fireworks calls this entrypoint to deploy your module in the cloud. In most cases you don’t need to modify it.
if __name__ == "__main__". Use this entrypoint locally to test your model before deploying to Fireworks.

There are other aspects of FluminaModule that we will cover later.

Examples

We have used Flumina to deploy our newest production image generation workloads. See the following for examples:

FLUX models
- [dev] bfloat16, fp8
- [schnell] bfloat16, fp8
Stable Diffusion 3.5 Models
- Large, large turbo, medium
