Flumina is Fireworks.ai’s new system for hosting and running inference on arbitrary server apps and code. Flumina lets you deploy deep learning inference apps to production in minutes, not weeks.

Fireworks’ base inference APIs support inference for a variety of verticalized use cases, including:

  1. Out-of-the-box inference for text completion (LLM) models, image generation, and audio models.
  2. Fine-tuned or custom models within supported architectures.

Flumina expands Fireworks’ functionality beyond the above by enabling you to deploy custom code to Fireworks. For example, Flumina excels in cases like:

  1. Custom audio models or pipelines, including components like speech separation, voice activity detection, speech recognition, and forced alignment.
  2. Image model pipelines, including components like text-to-image diffusion, upscaling, face detection, face swap, and more.

With Flumina, you upload Python code/models and get back a scalable, production-ready API endpoint for running these apps in the cloud.

Flumina also supports:

  • Specifying Python dependencies to be installed via a requirements.txt file
  • Per-GPU-hour billing
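A requirements.txt file uses the standard pip format. As an illustrative sketch only (the package names below are examples, not a required set for Flumina):

```text
# Illustrative requirements.txt -- pin the packages your app imports
torch>=2.1.0
torchaudio>=2.1.0
numpy<2.0
```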

Quickstart: Getting a Flumina Server App Running on the Cloud in Minutes

Note: This is an early preview of Flumina. Any feedback on user experience, feature requests, or bug reports would be greatly appreciated.

Prerequisites

First, create an account on fireworks.ai. Then install the fireworks-ai Python package, which contains the flumina CLI utility:

pip install --upgrade 'fireworks-ai[flumina]>=0.15.8'

Deploying your First Flumina App

The following commands create a sample Flumina app, upload it to Fireworks, and deploy it for inference. Copy your API key from https://fireworks.ai/account/api-keys and substitute it for API_KEY below:

mkdir flumina_app && cd flumina_app
flumina set-api-key API_KEY
flumina init app
flumina deploy my-flumina-app

Upon successful deployment, flumina deploy will print an example command for calling your newly deployed service, like so:

curl -X POST 'https://api.fireworks.ai/inference/v1/workflows/accounts/my-account/models/my-flumina-app/infer?deployment=accounts/my-account/deployments/de614476' \
    -H 'Authorization: Bearer API_KEY' \
    -H "Content-Type: application/json" \
    -d '{
        "input_val": "<value>"
    }'
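The same call can be made from Python. Below is a minimal sketch, using only the standard library, that assembles the request from the curl example above; the endpoint URL and API_KEY are the placeholders printed by flumina deploy, so swap in your own values and use any HTTP client (e.g. the third-party requests package, as shown in the comment) to actually send it:

```python
import json

# Placeholder endpoint copied from the deploy output; replace with your own.
ENDPOINT = (
    "https://api.fireworks.ai/inference/v1/workflows/"
    "accounts/my-account/models/my-flumina-app/infer"
    "?deployment=accounts/my-account/deployments/de614476"
)

def build_request(api_key: str, input_val: str):
    """Assemble the headers and JSON body for the inference call."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"input_val": input_val})
    return headers, body

headers, body = build_request("API_KEY", "<value>")

# To send the request against a live deployment (requires `pip install requests`):
#   import requests
#   resp = requests.post(ENDPOINT, headers=headers, data=body)
#   print(resp.json())
```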

You can list your models and deployments with flumina list models and flumina list deployments respectively:

$ flumina list models
account_id    model_id          state    base_model_id    url_prefix
my-account    my-flumina-app    READY                     https://api.fireworks.ai/inference/v1/workflows/accounts/my-account/models/my-flumina-app
$ flumina list deployments
account_id    deployment_id    state    base_model_id                                url_prefix
my-account    de614476         READY    accounts/my-account/models/my-flumina-app    https://api.fireworks.ai/inference/v1/workflows/accounts/my-account/models/my-flumina-app?deployment=accounts/my-account/deployments/de614476

Given the deployment_id and model_id above, we can delete the deployment and then the model (order matters: all of a model's deployments must be deleted before the model itself can be deleted):

$ flumina delete deployment de614476
$ flumina delete model my-flumina-app

Note that account_id is inferred for Flumina commands. To set it explicitly, pass the --account_id command-line flag.

Now that you’ve uploaded your first Server App, check out the Flumina reference to learn more about authoring Apps.