Flumina is Fireworks.ai’s new system for hosting and running inference on arbitrary server apps and code. Flumina lets you deploy deep learning inference apps to production in minutes, not weeks. Fireworks’ base inference APIs support inference for a variety of verticalized use cases, including:
- Out-of-the-box inference for text completion (LLM) models, image generation, and audio models.
- Fine-tuned models or custom models within supported architectures
- Custom audio models or pipelines, including components like Speech Separation, Voice Activity Detection, Speech Recognition, and Forced Alignment
- Image model pipelines, including components like text-to-image diffusion, upscaling, face-detection, face-swap, and more.
- Specifying dependencies to be downloaded via requirements.txt
- Per-GPU-hour billing
Quickstart: Getting a Flumina Server App Running on the Cloud in Minutes
Note: This is an early preview of Flumina. Any feedback on user experience, feature requests, or bug reports would be greatly appreciated.
Prerequisites
First, create an account on app.fireworks.ai. Then install the fireworks-ai Python package, which contains the flumina CLI utility.
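As a quick sanity check after installing, the sketch below tests whether a flumina executable is visible on your PATH. It uses only the Python standard library; the helper name cli_available is an illustrative assumption and is not part of the fireworks-ai package, and the pip command in the message assumes a standard pip-based install.

```python
import shutil


def cli_available(name: str = "flumina") -> bool:
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    if cli_available():
        print("flumina CLI found")
    else:
        # Assumption: the package is installed with pip under the name
        # given in this guide (fireworks-ai).
        print("flumina CLI not found; try: pip install fireworks-ai")
```

If the check fails inside a virtual environment, confirm that the environment where the package was installed is the one currently activated.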
Deploying your First Flumina App
The following commands will create a sample Flumina app, upload it to Fireworks, and deploy it for inference. Make sure to copy your API key from https://app.fireworks.ai/settings/users/api-keys and paste it in place of API_KEY below.
flumina deploy will print out an example command for calling into your newly deployed service, like so:
You can list your models and deployments with flumina list models and flumina list deployments, respectively:
Using the deployment_id and model_id above, we can delete the deployment and model (order matters: all deployments of a model must be deleted before the model is deleted):
account_id is inferred for Flumina commands. To set it explicitly, pass the command line flag --account_id.
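The ordering constraint on deletion (every deployment of a model goes first, then the model itself) can be sketched as a small planning helper. This is only an illustration of the ordering: it does not call Fireworks or the flumina CLI, and the function name deletion_plan and the example IDs are hypothetical.

```python
from typing import Dict, List, Tuple


def deletion_plan(deployments_by_model: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return (kind, id) steps with each model listed after its deployments."""
    plan: List[Tuple[str, str]] = []
    for model_id, deployment_ids in deployments_by_model.items():
        # All deployments of a model must be deleted first...
        for deployment_id in deployment_ids:
            plan.append(("deployment", deployment_id))
        # ...and only then the model itself.
        plan.append(("model", model_id))
    return plan


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    for kind, identifier in deletion_plan({"my-model": ["dep-1", "dep-2"]}):
        print(f"delete {kind}: {identifier}")
```

Each step in the resulting plan would then be carried out with the corresponding delete command for that deployment or model.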
Now that you’ve uploaded your first Server App, check out the Flumina reference to learn more about authoring Apps.