Onboarding
A quick guide to navigating and building with the Fireworks platform.
Introduction
Welcome to the Fireworks onboarding guide!
This guide is designed to help you quickly and effectively get started with the Fireworks platform, whether you’re a developer, researcher, or AI enthusiast. By following this step-by-step resource, you’ll learn how to explore and experiment with state-of-the-art AI models, prototype your ideas using Fireworks’ serverless infrastructure, and scale your projects with advanced on-demand deployments.
Who this guide is for
This guide is designed for new Fireworks users who are exploring the platform for the first time. It provides a hands-on introduction to the core features of Fireworks, including the model library, playgrounds, and on-demand deployments, all accessible through the web app.
For experienced users, this guide serves as a starting point, with future resources planned to dive deeper into advanced tools like firectl and other intermediate features to enhance your workflow.
Objectives of the guide
- Explore the Fireworks model library: Navigate and select generative AI models for text, image, and audio tasks.
- Experiment with the playground: Test prompts, tweak parameters, and generate outputs in real time.
- Prototype effortlessly: Use Fireworks’ serverless infrastructure to deploy and iterate without managing servers.
- Scale your AI: Learn how on-demand deployments offer predictable performance and advanced customization.
- Develop complex systems: Unlock advanced capabilities like Compound AI, function calling, and retrieval-augmented generation to create production-ready applications.
By the end of this guide, you’ll be equipped with the knowledge and tools to confidently use Fireworks to build, scale, and optimize AI-powered solutions. Let’s get started!
Step 1. Explore our model library
Fireworks provides a range of leading open-source models for tasks like text generation, code generation, and image understanding.
With the Fireworks model library, you can choose from a wide range of popular LLMs, VLMs, image generation models, and audio models, such as:
- LLMs: Llama 3.3 70B, Deepseek V3, and Qwen2.5 Coder 32B Instruct.
- VLMs: Llama 3.2 90B Vision Instruct.
- Image generation models: BFL's FLUX.1 [dev] FP8 and Stability AI's Stable Diffusion 3.5 Large Turbo.
- Audio models: Whisper V3 and the blazing-fast Whisper V3 Turbo.
Embedding models from Nomic AI are also available.
Developers building generative AI applications can interact with Fireworks in multiple ways:
- 🌐 Via the web app: Access the Fireworks platform directly in your browser for easy model management.
- 🐍 Through our Python SDK: Programmatically integrate and manage models within your codebase.
- 🔗 With external providers: Pass your Fireworks API key to third-party tools for seamless workflow integration.
For additional documentation and guides, check out our Cookbook, which includes community-contributed notebooks and applications.
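To make the API option above concrete, here is a minimal sketch of a chat completions request using Python's requests library. It targets Fireworks' OpenAI-compatible REST endpoint; the model ID shown is illustrative, so substitute the exact ID of any model you pick from the library.

```python
import os
import requests

# Minimal sketch: call the Fireworks chat completions REST endpoint directly.
# The model ID below is illustrative; copy the exact ID from the model library.
url = "https://api.fireworks.ai/inference/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}",
    "Content-Type": "application/json",
}
payload = {
    "model": "accounts/fireworks/models/llama-v3p3-70b-instruct",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
}

response = requests.post(url, headers=headers, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```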
Action items
- 👀 Browse the model library: Explore our open and closed-source models.
- 📚 Read real-world use cases: See how customers are building production systems with Fireworks.
- 👋 Join our Discord community: Connect and share your projects.
Step 2. Experiment using the model playground
The easiest way to get started with Fireworks and test models with minimal setup is through the Model Playground. Here, you can experiment with prompts, adjust parameters, and get immediate feedback on results before moving to more advanced steps.
Use the LLM Playground to experiment with text-based models, the Image Playground to generate and iterate on images, and the Audio Playground to run transcription and translation tasks.
Each model in the Playground includes the following features, designed to enhance your experimentation and streamline your workflow:
- 🎛️ Parameter controls: Adjust settings like temperature and max tokens for LLMs, or image-specific parameters (e.g., guidance scale) for image generation models, to fine-tune model behavior and outputs for different use cases.
- 🧩 Code samples: Copy ready-to-use code in Python, TypeScript, or Shell to integrate models directly into your applications, eliminating guesswork around API implementation and speeding up development.
- 🎨 Additional UI elements: Use interactive features like file upload buttons for image or audio inputs to test multimodal capabilities without any additional setup.
- 🔍 Model ID: Clearly displayed in the format accounts/fireworks/models/<model_name>, so you can switch between models by changing a single line of code (see the sketch after this list).
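As referenced above, switching models in code comes down to changing the model ID string. Below is a minimal sketch using the Fireworks Python client; the import path and both model IDs are assumptions based on the current SDK and library listings, so treat the code sample shown in the Playground for your chosen model as the authoritative version.

```python
import os
from fireworks.client import Fireworks

# Sketch of swapping models by changing only the model ID string.
# The import path and model IDs are assumptions; the Playground's code
# sample for your chosen model is the authoritative reference.
client = Fireworks(api_key=os.environ["FIREWORKS_API_KEY"])

MODEL_ID = "accounts/fireworks/models/llama-v3p3-70b-instruct"
# MODEL_ID = "accounts/fireworks/models/qwen2p5-coder-32b-instruct"  # swap with one line

response = client.chat.completions.create(
    model=MODEL_ID,
    messages=[{"role": "user", "content": "Write a haiku about GPUs."}],
    temperature=0.6,
    max_tokens=128,
)
print(response.choices[0].message.content)
```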
Action items
- 💻 🖱️ Sign in to your account and explore various models, including:
  - LLMs: Llama 3.3 70B
  - VLMs: Llama 3.2 90B Vision Instruct
  - Image generation models: FLUX.1 [dev] FP8
  - Audio models: Whisper V3 Turbo
- ❓ Have questions, comments, or feedback? Head over to Discord and post in #feature-requests, #questions, or #bug-reports.
- 📚 Check out sampling options: Review the sampling options for text models to see the parameters we currently support (a quick sketch follows below).
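As a reference point for the sampling options item above, here is a sketch of a request body with common sampling parameters; it drops into the REST call from Step 1 in place of that payload. The parameter names shown are the usual ones (temperature, top_p, top_k, max_tokens); the API reference remains the authoritative list of supported parameters and valid ranges.

```python
# Sketch of common sampling parameters in a chat completions request body.
# Names and values here are indicative; see the API reference for the
# complete, authoritative list and valid ranges.
payload = {
    "model": "accounts/fireworks/models/llama-v3p3-70b-instruct",
    "messages": [{"role": "user", "content": "Summarize what sampling does."}],
    "temperature": 0.7,   # randomness: lower is more deterministic
    "top_p": 0.9,         # nucleus sampling cutoff
    "top_k": 40,          # restrict sampling to the k most likely tokens
    "max_tokens": 256,    # cap on generated tokens
}
```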
Step 3. Prototyping with serverless
Fireworks’ serverless infrastructure lets you quickly prototype AI models without managing servers or committing to long-term contracts. This setup supports fast experimentation and seamless scaling for your projects.
Why use Fireworks serverless?
- 🚀 Launch instantly: Deploy apps with no setup or configuration required.
- 🎯 Focus on prompt engineering: Design and refine your prompts without worrying about infrastructure.
- ⚙️ Adjust parameters easily: Modify settings like temperature and max tokens to customize model outputs.
- 💰 Pay-as-you-go: Only pay for what you use, with pricing based on parameter size buckets, making it cost-effective for projects of any size.
To start prototyping, you’ll need to obtain your API key, which allows you to interact with Fireworks’ serverless models programmatically.
Using your API key
Your API key is essential for securely accessing and managing your serverless deployments. Here’s how to use it:
- Via the API: Include your API key in the headers of your RESTful API requests to integrate Fireworks’ models into your applications.
- Using our SDK: Configure the Fireworks Python library with your API key to manage and deploy models programmatically.
- Through third-party tools: Pass your API key to third-party clients (like LangChain) to incorporate Fireworks into your existing workflows, enabling you to use serverless models seamlessly.
Additionally, Fireworks is OpenAI compatible, enabling you to leverage familiar OpenAI tools and integrations within your Fireworks projects.
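Because the API is OpenAI compatible, one low-friction way to prototype is to point the official OpenAI Python client at the Fireworks base URL, as sketched below. The base URL is Fireworks' inference endpoint; the model ID is illustrative, so use any serverless model from the library.

```python
import os
from openai import OpenAI

# Sketch of using the OpenAI Python client against Fireworks'
# OpenAI-compatible API. The model ID is illustrative.
client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p3-70b-instruct",
    messages=[{"role": "user", "content": "Give me three prompt-engineering tips."}],
)
print(response.choices[0].message.content)
```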
Action items
- 🔑 Get your API key: Navigate to your account settings and generate your API key to authenticate your requests.
- 📓 Call a serverless model: See how you can call a serverless model using a sample notebook.
- 🔖 Read the API usage guide: Understand the different endpoints and parameters available for use in your projects.
- 📚 Read the serverless deployment guides: Access our docs on serverless usage, pricing, and rate limits.
- 💻 Try out additional sample notebooks: Use your Fireworks API key to explore more sample notebooks in our cookbook.
Step 4. Scale out with on-demand deployments
Fireworks’ on-demand deployments provide you with dedicated GPU instances, ensuring predictable performance and advanced customization options for your AI workloads. These deployments allow you to scale efficiently, optimize costs, and access exclusive models that aren’t available on serverless infrastructure.
Why choose on-demand deployments?
- 🏎️ Predictable performance: Enjoy consistent performance unaffected by other users’ workloads.
- 📈 Flexible scaling: Adjust replicas or GPU resources to handle varying workloads efficiently.
- ⚙️ Customization: Choose GPU types, enable features like long-context support, and apply quantization to optimize costs.
- 🔓 Expanded access: Deploy larger models or custom models from Hugging Face files.
- 💰 Cost optimization: Save more with reserved capacity when you have high utilization needs.
Key features of on-demand deployments
- 🔄 Replica scaling: Automatically adjust replicas to handle workload changes.
- 🖥️ Hardware options: Choose GPUs like NVIDIA H100, NVIDIA A100, or AMD MI300X to match your performance and budget needs. Check the Regions Guide for availability.
- ⚡ Quantization: Use FP8 or other precision settings to improve speed and reduce costs while keeping accuracy high. See the Quantization Guide.
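From the calling side, an on-demand deployment is reached with the same chat completions request as serverless; typically only the model identifier changes to reference your deployment. The identifier below is a hypothetical placeholder: copy the exact value shown for your deployment in the web app or the deployments docs rather than constructing it by hand.

```python
import os
from openai import OpenAI

# Sketch: calling a dedicated (on-demand) deployment looks like a serverless
# call; only the model identifier points at your deployment. The identifier
# below is a hypothetical placeholder: copy the real one from your
# deployment's page in the web app.
client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

DEPLOYED_MODEL_ID = "<your-deployment-model-identifier>"  # placeholder

response = client.chat.completions.create(
    model=DEPLOYED_MODEL_ID,
    messages=[{"role": "user", "content": "Run a quick smoke test."}],
)
print(response.choices[0].message.content)
```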
Action items
- 🔖 Understand the benefits of on-demand versus serverless: Learn about the full range of deployment options and how to customize them to your needs.
- 📚 Explore optimization techniques: Learn how caching, quantization, and speculative decoding can improve performance and reduce costs.
- ❓ Check out our FAQs: Find answers to common questions about account management, support services, and on-demand deployment infrastructure.
Step 5. Building Compound AI systems
Expand your AI capabilities by incorporating advanced features like Compound AI, function calling, or retrieval-augmented generation (RAG). These tools enable you to build sophisticated applications that integrate seamlessly with external systems. For greater control, consider on-prem or BYOC deployments.
With Fireworks, you can:
- 🛠️ Leverage advanced features: Build Compound AI systems with function calling, RAG, and agents (Advanced Features).
- 🔗 Integrate external tools: Connect models with APIs, databases, or other services to enhance functionality.
- 🔍 Optimize workflows: Use Fireworks’ advanced tools to streamline AI development, enhance system efficiency, and scale complex applications with ease.
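To make the function calling piece concrete, here is a minimal sketch using the OpenAI-compatible tools interface. The model ID and the tool schema are illustrative assumptions; check the function calling docs for the currently recommended models and exact request and response behavior.

```python
import json
import os
from openai import OpenAI

# Minimal function calling sketch over Fireworks' OpenAI-compatible API.
# The model ID and tool schema are illustrative; see the function calling
# docs for recommended models and full details.
client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p3-70b-instruct",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
)

# If the model chose to call the tool, inspect the structured arguments.
tool_calls = response.choices[0].message.tool_calls or []
for call in tool_calls:
    print(call.function.name, json.loads(call.function.arguments))
```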
Action items
- 📚 Learn about Compound AI and Advanced Features: Explore richer functionality to create more sophisticated applications.
- Fireworks Compound AI System: With f1, experience how specialized models work together to deliver groundbreaking performance, efficiency, and advanced reasoning capabilities.
- Document inlining: Make any LLM capable of processing documents for seamless retrieval, summarization, and comprehension.
- Multimodal enterprise: See how Fireworks integrates text, image, and audio models to power enterprise-grade multimodal AI solutions.
- Multi-LoRA fine-tuning: Learn how Multi-LoRA fine-tuning enables precise model customization across diverse datasets.
- Audio transcription launch: Explore Fireworks’ state-of-the-art audio transcription models for fast and accurate speech-to-text applications.
- 📞 Contact us for enterprise solutions: Have complex requirements or need reserved capacity? Reach out to our team to discuss tailored solutions for your organization.
🌟 Dive deeper into the docs
Ready to learn more? Continue exploring the Fireworks documentation to uncover specific tools, workflows, and advanced features that can help you take your AI systems to the next level.