Custom Server Apps with Flumina
This page describes how to define and upload custom Server Apps to the Fireworks platform by using the Flumina toolkit.
Flumina is Fireworks.ai’s new system for hosting and running inference on arbitrary server apps and code. Flumina lets you deploy deep learning inference apps to production in minutes, not weeks.
Fireworks’ base inference APIs support inference for a variety of verticalized use cases, including:
- Out-of-the-box inference for text completion (LLM) models, image generation, and audio models.
- Fine-tuned or custom models within supported architectures.
Flumina expands Fireworks’ functionality beyond the above by enabling you to deploy custom code to Fireworks. For example, Flumina excels in cases like:
- Custom audio models or pipelines, including components like speech separation, voice activity detection, speech recognition, and forced alignment.
- Image model pipelines, including components like text-to-image diffusion, upscaling, face detection, face swap, and more.
With Flumina, you upload Python code/models and get back a scalable, production-ready API endpoint for running these apps in the cloud.
Flumina also supports:
- Specifying dependencies to be downloaded via `requirements.txt`
- Per-GPU-hour billing
Quickstart: Getting a Flumina Server App Running on the Cloud in Minutes
Note: This is an early preview of Flumina. Any feedback on user experience, feature requests, or bug reports would be greatly appreciated.
Prerequisites
First, create an account on fireworks.ai. Then install the `fireworks-ai` Python package, which contains the `flumina` CLI utility.
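Assuming a standard pip setup, installation looks like this (the `--help` check is an assumption; most CLIs support it, but consult the CLI's own usage output if it differs):

```shell
# Install the Fireworks SDK, which bundles the flumina CLI
pip install fireworks-ai

# Confirm the flumina CLI is on your PATH
flumina --help
```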
Deploying your First Flumina App
The following commands create a sample Flumina app, upload it to Fireworks, and deploy it for inference. Make sure to copy your API key from https://fireworks.ai/account/api-keys and paste it in place of `API_KEY` below.
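A sketch of the flow, with two assumptions: that the CLI reads the `FIREWORKS_API_KEY` environment variable for authentication, and that a scaffolding subcommand (shown here as a hypothetical `flumina init`) creates the sample app. Check `flumina --help` for the exact names in your version:

```shell
# Authenticate with your Fireworks API key (replace API_KEY with your own)
export FIREWORKS_API_KEY=API_KEY

# Scaffold a sample app (hypothetical subcommand name)
flumina init sample-app
cd sample-app

# Upload the app to Fireworks and deploy it for inference
flumina deploy
```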
Upon successful deployment, `flumina deploy` will print an example command for calling your newly deployed service, like so:
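The printed command varies by app; the endpoint URL, route, and payload below are hypothetical placeholders, not the real values, which `flumina deploy` prints for your specific deployment:

```shell
# Hypothetical invocation of a deployed text-to-image app;
# substitute the URL printed by `flumina deploy`
curl -X POST 'https://api.fireworks.ai/inference/v1/workflows/accounts/ACCOUNT_ID/models/MODEL_ID/text_to_image' \
  -H "Authorization: Bearer $FIREWORKS_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"prompt": "A photo of a cat"}' \
  --output output.png
```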
You can list your models and deployments with `flumina list models` and `flumina list deployments`, respectively:
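Both subcommands are run without arguments; the output format may vary by CLI version:

```shell
# List the models you have uploaded to your account
flumina list models

# List the active deployments of those models
flumina list deployments
```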
Given the `deployment_id` and `model_id` above, we can delete the deployment and then the model. Order matters: all deployments of a model must be deleted before the model itself can be deleted:
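Assuming `delete` subcommands mirror the `list` ones (an assumption; check `flumina --help` for the exact names), teardown might look like:

```shell
# Delete the deployment first...
flumina delete deployment DEPLOYMENT_ID

# ...then delete the model itself
flumina delete model MODEL_ID
```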
Note that `account_id` is inferred for Flumina commands. To set it explicitly, pass the command-line flag `--account_id`.
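For example, overriding the inferred account on a list command (the account name here is a placeholder):

```shell
# Explicitly target a specific account instead of the inferred one
flumina list deployments --account_id my-account
```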
Now that you’ve uploaded your first Server App, check out the Flumina reference to learn more about authoring Apps.