Introduction
Welcome to the Fireworks onboarding guide! This guide is designed to help you quickly and effectively get started with the Fireworks platform, whether you’re a developer, researcher, or AI enthusiast. By following this step-by-step resource, you’ll learn how to explore and experiment with state-of-the-art AI models, prototype your ideas using Fireworks’ serverless infrastructure, and scale your projects with advanced on-demand deployments.
Who this guide is for
This guide is designed for new Fireworks users who are exploring the platform for the first time. It provides a hands-on introduction to the core features of Fireworks, including the model library, playgrounds, and on-demand deployments, all accessible through the web app. For experienced users, this guide serves as a starting point, with future resources planned to dive deeper into advanced tools like firectl and other intermediate features to enhance your workflow.
Objectives of the guide
- Explore the Fireworks model library: Navigate and select generative AI models for text, image, and audio tasks.
- Experiment with the playground: Test prompts, tweak parameters, and generate outputs in real time.
- Prototype effortlessly: Use Fireworks’ serverless infrastructure to deploy and iterate without managing servers.
- Scale your AI: Learn how on-demand deployments offer predictable performance and advanced customization.
- Develop complex systems: Unlock advanced capabilities like Compound AI, function calling, and retrieval-augmented generation to create production-ready applications.
Step 1. Explore our model library
Fireworks provides a range of leading open-source models for tasks like text generation, code generation, and image understanding. With the Fireworks model library, you can choose from our wide range of popular LLMs, VLMs, image models, and audio models, such as:
- LLMs: Llama 3.3 70B, DeepSeek V3, and Qwen2.5 Coder 32B Instruct.
- VLMs: Llama 3.2 90B Vision Instruct.
- Image models: BFL’s FLUX.1 [dev] FP8 and Stability.ai’s Stable Diffusion 3.5 Large Turbo.
- Audio models: Whisper V3 and the (blazing fast) Whisper V3 Turbo.
🎥 Part 1: Introducing the Model Library
In this video, we introduce the Fireworks Model Library, your gateway to a diverse range of open-source and proprietary models designed for tasks like text generation, image understanding, and audio processing. Whether you’re a developer or a creative, Fireworks makes it easy to find and integrate the right tools for your generative AI needs.
What you’ll learn:
1️⃣ Navigating the model library: Browse popular models, filter by deployment type, and search for specific tools like Llama, Whisper, and Flux.
2️⃣ Customizing your experience: Use filters like “Serverless Models” to find models that fit your specific needs.
3️⃣ Seamless integration: Discover how Fireworks simplifies the process of discovering and managing AI models.
Developers building generative AI applications can interact with Fireworks in multiple ways:
- 🌐 Via the web app: Access the Fireworks platform directly in your browser for easy model management.
- 🐍 Through our Python SDK: Programmatically integrate and manage models within your codebase (see the sketch after this list).
- 🔗 With external providers: Pass your Fireworks API key to third-party tools for seamless workflow integration.
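For the SDK route, here is a minimal sketch of a chat completion call using the fireworks-ai Python package. The model ID below is an example that follows the library’s naming pattern; copy the exact ID from the model’s page before running it.

```python
# pip install fireworks-ai
import os

from fireworks.client import Fireworks

# Reads the API key you create in the web app (Step 3 below covers this).
client = Fireworks(api_key=os.environ["FIREWORKS_API_KEY"])

response = client.chat.completions.create(
    # Example ID for Llama 3.3 70B Instruct; confirm the exact string in the model library.
    model="accounts/fireworks/models/llama-v3p3-70b-instruct",
    messages=[{"role": "user", "content": "Give me a one-sentence overview of Fireworks AI."}],
)
print(response.choices[0].message.content)
```

The same call also works through any OpenAI-compatible client pointed at the Fireworks base URL (https://api.fireworks.ai/inference/v1), which is how most third-party tools integrate.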
Action items
- 👀 Browse the model library: Explore our open and closed-source models.
- 📚 Read real-world use cases: See how customers are building production systems with Fireworks.
- 👋 Join our Discord community: Connect and share your projects.
Step 2. Experiment using the model playground
The easiest way to get started with Fireworks and test models with minimal setup is through the Model Playground. Here, you can experiment with prompts, adjust parameters, and get immediate feedback on results before moving to more advanced steps. Take a closer look at how the LLM Playground lets you experiment with text-based models.
🎥 Part 2A: Introducing the LLM playground
In this video, we explore the Fireworks Model Playground, the easiest way to experiment with LLMs, adjust parameters, and get instant feedback. Whether you’re crafting creative prompts, refining outputs, or testing model performance, the Playground is your go-to tool for seamless experimentation.
✨ What you’ll learn:
- 🔍 Getting started: Access the Playground from the Model Library by selecting models like Llama 3.3 70B Instruct.
- 📋 Model details: Discover key information, including starter code in Python, TypeScript, Java, Go, and Shell for Chat and Completion modes.
- 🎭 Running prompts: Test creative prompts like “Write a synopsis of the modern 2020 version of the Cats musical” and see instant results.
- 🎛️ Parameter controls: Adjust settings like temperature and max tokens to refine outputs to your liking.
- ⚡ Completion mode: Explore latency and tokens-per-second metrics with prompts like “Write a synopsis of the modern 2020 Tarzan movie with Brendan Fraser.”
- 💻 Code integration: Generate ready-to-use code snippets directly from the Playground for effortless integration into your projects.
Discover how the Image Playground transforms visual AI experimentation into an intuitive process.
🎥 Part 2B: Introducing the Image Playground
In this video, we dive into the Fireworks Image Playground, where you can create stunning visuals, refine parameters, and explore the possibilities of AI-driven image generation. Perfect for developers, designers, and creators, the Image Playground is your gateway to experimenting with prompts and parameters for artistic and practical outputs.
✨ What you’ll learn:
- ☑️ Getting started: Navigate the Model Library to find image models like FLUX.1 schnell FP8 and open them in the Model Playground.
- ☑️ Crafting prompts: Use creative prompts like “Movie poster for a film set in a world where gravity doesn’t exist” and watch the model bring your vision to life.
- ☑️ Adjusting parameters: Experiment with settings like Guidance Scale, Inference Steps, and Seed to refine and perfect your results.
- ☑️ Exploring variants: Test different models, such as FLUX.1 dev FP8, for varied image quality and creative flexibility.
- ☑️ Integrating code: Generate and view sample code in Python, TypeScript, or Shell, complete with request parameters and response codes for seamless integration.
Experience how the Audio Playground empowers advanced audio transcription and translation tasks.
🎥 Part 2C: Introducing the Audio Playground
Welcome to Part 2C of our onboarding series! In this video, we explore the Fireworks Audio Playground, showcasing the incredible speed and accuracy of the Whisper Turbo models. Whether you’re transcribing, translating, or analyzing audio, Fireworks makes it easy to experiment and unlock the potential of advanced audio models.
✨ What you’ll learn:
- 🎵 Real-world test case: Using the song Do You Hear the People Sing? from Les Misérables, featuring nine distinct languages and various English accents, to demonstrate transcription and translation capabilities.
- 🔍 Navigating the model library: Find Whisper v3 Turbo and access its playground.
- 📂 Uploading audio: Test the model with screen-recorded audio to ensure unbiased results without metadata influence.
- ⚡ Fast and accurate transcription: Observe Whisper Turbo’s ability to transcribe multilingual content at lightning speed and compare its output to the original lyrics.
🔑 Key features of the Audio Playground:
- 🌍 Multilingual capabilities: Whisper Turbo excels in recognizing and transcribing multiple languages and dialects.
- ⚡ Incredible speed: Experience near-instant transcriptions for even complex audio files.
- 🎛️ Interactive testing: Upload audio, tweak parameters, and explore transcription and translation features in real time (a hedged API sketch follows this list).
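Outside the Playground, the same transcription flow can be scripted. The sketch below assumes an OpenAI-style audio transcription endpoint and the model name whisper-v3-turbo; treat both as assumptions and copy the exact URL and model ID from the Playground’s generated sample code or the API reference.

```python
# Hedged sketch: the endpoint path and model name are assumptions based on the
# OpenAI-compatible pattern; copy the exact values from the Playground's sample code.
import os
import requests

API_KEY = os.environ["FIREWORKS_API_KEY"]
URL = "https://api.fireworks.ai/inference/v1/audio/transcriptions"  # assumed endpoint

with open("les_mis_clip.mp3", "rb") as audio_file:  # any local audio file
    response = requests.post(
        URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": audio_file},
        data={"model": "whisper-v3-turbo"},  # assumed model ID
    )

response.raise_for_status()
print(response.json()["text"])  # transcribed text, per the OpenAI-style response shape
```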
Each model in the Playground includes the following features, designed to enhance your experimentation and streamline your workflow:
- 🎛️ Parameter controls: Adjust settings like temperature and max tokens for LLMs or image-specific parameters (e.g., Guidance Scale) for image generation models. These controls allow you to fine-tune the behavior and outputs of the models, helping you achieve the desired results for different use cases.
- 🧩 Code samples: Copy-paste ready-to-use code in Python, TypeScript, or Shell to integrate models directly into your applications. This eliminates the guesswork of API implementation and speeds up development, so you can focus on building impactful solutions.
- 🎨 Additional UI elements: Leverage interactive features like file upload buttons for image or audio inputs, making it easy to test multimodal capabilities without any additional setup. This ensures a smooth, hands-on testing experience, even for complex workflows.
- 🔍 Model ID: Clearly displayed in the format `accounts/fireworks/models/<model_name>`, allowing you to switch between models effortlessly with a single line of code, making experimentation and integration faster and more efficient (see the sketch after this list).
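Because every model shares the same ID format, swapping models really is a one-line change. The sketch below varies only the model ID and a couple of sampling parameters; the IDs shown are examples following the library’s naming pattern, so confirm the exact strings on each model’s page.

```python
import os

from fireworks.client import Fireworks

client = Fireworks(api_key=os.environ["FIREWORKS_API_KEY"])

# Example IDs following the accounts/fireworks/models/<model_name> pattern;
# copy the exact strings from each model's page in the library.
model_ids = [
    "accounts/fireworks/models/llama-v3p3-70b-instruct",
    "accounts/fireworks/models/qwen2p5-coder-32b-instruct",
]

for model_id in model_ids:
    response = client.chat.completions.create(
        model=model_id,   # the only line that changes between models
        messages=[{"role": "user", "content": "Write a haiku about GPUs."}],
        temperature=0.7,  # same controls as the Playground's parameter sliders
        max_tokens=128,
    )
    print(model_id, "->", response.choices[0].message.content)
```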
Action items
- 💻 🖱️ Sign into your account and explore various models, including:
- LLMs and VLMs: Llama 3.3 70B, Llama 3.2 90B Vision Instruct
- Image models: FLUX.1 [dev] FP8
- Audio models: Whisper V3 Turbo
- ❓ Have questions, comments, or feedback? Head over to Discord and post in:
- #feature-requests
- #questions
- #bug-reports
- 📚 Check out sampling options: Review the sampling options for text models to see the parameters we currently support.
Step 3. Prototyping with serverless
Fireworks’ serverless infrastructure lets you quickly prototype AI models without managing servers or committing to long-term contracts. This setup supports fast experimentation and seamless scaling for your projects.
Why use Fireworks serverless?
- 🚀 Launch instantly: Deploy apps with no setup or configuration required.
- 🎯 Focus on prompt engineering: Design and refine your prompts without worrying about infrastructure.
- ⚙️ Adjust parameters easily: Modify settings like temperature and max tokens to customize model outputs.
- 💰 Pay-as-you-go: Only pay for what you use, with pricing based on parameter size buckets, making it cost-effective for projects of any size.
🎥 Part 3A: Generating Your API Key
In this video, we’ll guide you through generating your Fireworks API key, the first step to leveraging Fireworks’ serverless infrastructure. Prototype AI models with ease, scale seamlessly, and focus on building without worrying about managing servers.
✨ Why use Fireworks serverless?
- 🚀 Launch instantly: Deploy apps with no setup or configuration required.
- 🎯 Focus on prompt engineering: Refine your prompts without infrastructure headaches.
- ⚙️ Adjust parameters easily: Tweak settings like temperature and max tokens to customize outputs.
- 💰 Pay-as-you-go: Cost-effective pricing based on usage, perfect for projects of any size.
🛠️ How to get your API key:
1️⃣ Navigate to User Settings: Log in to your Fireworks account and click the profile icon.
2️⃣ Generate your key: Select ‘API Keys’ and click ‘Create API Key’ to generate your unique key.
3️⃣ Copy and secure: Save your API key securely; it’s essential for authentication (the sketch below shows one way to load it from an environment variable).
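A common way to keep the key out of your code is to store it in an environment variable and read it at runtime. A minimal sketch, using FIREWORKS_API_KEY as the variable name (a common convention rather than a requirement; check the docs of whichever tool you use):

```python
# In your shell (not Python), set the variable once per session, e.g.:
#   export FIREWORKS_API_KEY="your-key-here"   # value copied from the web app
import os

api_key = os.environ.get("FIREWORKS_API_KEY")
if api_key is None:
    raise RuntimeError("Set FIREWORKS_API_KEY before calling the Fireworks API.")
```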
Using your API key
Your API key is essential for securely accessing and managing your serverless deployments. Here’s how to use it:
- Via the API: Include your API key in the headers of your RESTful API requests to integrate Fireworks’ models into your applications (a sketch follows this list).
- Using our SDK: Configure the Fireworks Python library with your API key to manage and deploy models programmatically.
- Through third-party tools: Pass your API key to third-party clients (like LangChain) to incorporate Fireworks into your existing workflows, enabling you to use serverless models seamlessly.
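For the direct REST route, here is a minimal sketch using Python’s requests library, hitting the chat completions endpoint and passing the key as a Bearer token. The model ID is an example; copy the exact string from the model library.

```python
import os

import requests

response = requests.post(
    "https://api.fireworks.ai/inference/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        # Example model ID; confirm the exact string in the model library.
        "model": "accounts/fireworks/models/llama-v3p3-70b-instruct",
        "messages": [{"role": "user", "content": "Name three uses for a serverless LLM."}],
        "max_tokens": 256,
    },
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

Third-party clients such as LangChain typically need the same key (and sometimes the base URL https://api.fireworks.ai/inference/v1) in their Fireworks or OpenAI-compatible integration settings.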
🎥 Part 3B: Calling an LLM
In this video, we’ll show you how to use your Fireworks API key to call serverless LLMs and effortlessly prototype with Fireworks’ serverless infrastructure. Whether you’re creating structured datasets or testing model outputs, Fireworks makes scaling your ideas simple—no servers required!
✨ What you’ll learn:
- 📖 Accessing the Cookbook: Explore Fireworks’ GitHub repo and open example notebooks like “Llama 3.1 Synthetic Data Generation” in Colab.
- 🔑 Using your API key: Learn how to securely generate and use your Fireworks API key for authentication.
- 🤖 Interacting with models: Call Llama 3.1 models to generate structured synthetic data and customize outputs.
- 🎯 Prompt engineering in action: See how to craft prompts to generate JSON-structured quiz questions with context, responses, and metadata.
🌟 Featured example:
Watch as we:
- 📍 Generate geography quiz questions: Using Llama 3.1 405B for structured outputs.
- 💾 Save data: Store structured data in JSONL format for project use.
- ⚡ Showcase flexibility: Highlight how Fireworks supports dataset creation, testing, and more (a hedged prompting sketch follows this list).
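Here is a hedged sketch of the kind of prompt-driven structured generation shown in the notebook: it asks the model for a quiz question as strict JSON and parses the reply. The model ID is an example for Llama 3.1 405B Instruct; copy the exact ID from the model library.

```python
import json
import os

from fireworks.client import Fireworks

client = Fireworks(api_key=os.environ["FIREWORKS_API_KEY"])

prompt = (
    "Write one geography quiz question as JSON with the keys "
    '"question", "choices" (a list of four strings), and "answer". '
    "Return only the JSON object, with no extra text."
)

response = client.chat.completions.create(
    # Example ID for Llama 3.1 405B Instruct; confirm it in the model library.
    model="accounts/fireworks/models/llama-v3p1-405b-instruct",
    messages=[{"role": "user", "content": prompt}],
    temperature=0.2,  # low temperature keeps the JSON structure stable
)

question = json.loads(response.choices[0].message.content)
print(question["question"], question["choices"], question["answer"])
```

To build a dataset like the one in the video, append each parsed object to a .jsonl file, one JSON object per line. For stricter guarantees, Fireworks also offers structured response modes on supported models; see the docs on structured outputs.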
Action items
- 🔑 Get your API key: Navigate to your account settings and generate your API key to authenticate your requests.
- 📓 Call a serverless model: See how you can call a serverless model using a sample notebook.
- 🔖 Read the API usage guide: Understand the different endpoints and parameters available for use in your projects.
- 📚 Read the serverless deployment guides: Access our docs on serverless usage, pricing, and rate limits.
- 💻 Try out additional sample notebooks: Use your Fireworks API key to explore more sample notebooks in our cookbook.
Step 4. Scale out with on-demand deployments
Fireworks’ on-demand deployments provide you with dedicated GPU instances, ensuring predictable performance and advanced customization options for your AI workloads. These deployments allow you to scale efficiently, optimize costs, and access exclusive models that aren’t available on serverless infrastructure. Once a deployment is live, you call it through the same API you already used for serverless models (a hedged sketch follows the feature lists below).
Why choose on-demand deployments?
- 🏎️ Predictable performance: Enjoy consistent performance unaffected by other users’ workloads.
- 📈 Flexible scaling: Adjust replicas or GPU resources to handle varying workloads efficiently.
- ⚙️ Customization: Choose GPU types, enable features like long-context support, and apply quantization to optimize costs.
- 🔓 Expanded access: Deploy larger models or custom models from Hugging Face files.
- 💰 Cost optimization: Save more with reserved capacity when you have high utilization needs.
Key features of on-demand deployments
- 🔄 Replica scaling: Automatically adjust replicas to handle workload changes.
- 🖥️ Hardware options: Choose GPUs like NVIDIA H100, NVIDIA A100, or AMD MI300X to match your performance and budget needs. Check the Regions Guide for availability.
- ⚡ Quantization: Use FP8 or other precision settings to improve speed and reduce costs while keeping accuracy high. See the Quantization Guide.
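Calling an on-demand deployment looks just like calling a serverless model; only the model identifier changes. The identifier below is a hypothetical placeholder: copy the real value from your deployment’s page in the web app.

```python
import os

from fireworks.client import Fireworks

client = Fireworks(api_key=os.environ["FIREWORKS_API_KEY"])

# Hypothetical identifier for illustration only; use the exact model/deployment
# string shown on your deployment's page in the web app.
MY_DEPLOYED_MODEL = "accounts/your-account/models/your-deployed-model"

response = client.chat.completions.create(
    model=MY_DEPLOYED_MODEL,
    messages=[{"role": "user", "content": "Run a quick smoke test on the new deployment."}],
)
print(response.choices[0].message.content)
```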
Action items
- 🔖 Understand the benefits of on-demand versus serverless: Learn about the full range of deployment options and how to customize them to your needs.
- 📚 Explore optimization techniques: Learn how caching, quantization, and speculative decoding can improve performance and reduce costs.
- ❓ Check out our FAQs: Find answers to common questions about account management, support services, and on-demand deployment infrastructure.
Step 5. Building Compound AI systems
Expand your AI capabilities by incorporating advanced features like Compound AI, function calling, or retrieval-augmented generation (RAG). These tools enable you to build sophisticated applications that integrate seamlessly with external systems. For greater control, consider on-prem or BYOC deployments.
With Fireworks, you can:
- 🛠️ Leverage advanced features: Build Compound AI systems with function calling, RAG, and agents (Advanced Features); a hedged function-calling sketch follows this list.
- 🔗 Integrate external tools: Connect models with APIs, databases, or other services to enhance functionality.
- 🔍 Optimize workflows: Use Fireworks’ advanced tools to streamline AI development, enhance system efficiency, and scale complex applications with ease.
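As a taste of function calling, here is a hedged sketch using the OpenAI-style tools parameter on a function-calling-capable model. The model ID and the weather tool are illustrative assumptions; see the function calling docs for supported models and the full loop of executing the tool and returning its result to the model.

```python
import json
import os

from fireworks.client import Fireworks

client = Fireworks(api_key=os.environ["FIREWORKS_API_KEY"])

# A hypothetical tool definition for illustration.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    # Example ID for a function-calling-capable model; confirm in the model library.
    model="accounts/fireworks/models/llama-v3p1-70b-instruct",
    messages=[{"role": "user", "content": "What's the weather in Paris right now?"}],
    tools=tools,
)

# If the model decided to call the tool, inspect the structured call it produced.
tool_calls = response.choices[0].message.tool_calls or []
for call in tool_calls:
    print(call.function.name, json.loads(call.function.arguments))
```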
Action items
- 📚 Learn about Compound AI and Advanced Features: Explore richer functionality to create more sophisticated applications.
- Fireworks Compound AI System: With f1, experience how specialized models work together to deliver groundbreaking performance, efficiency, and advanced reasoning capabilities.
- Multimodal enterprise: See how Fireworks integrates text, image, and audio models to power enterprise-grade multimodal AI solutions.
- Multi-LoRA fine-tuning: Learn how Multi-LoRA fine-tuning enables precise model customization across diverse datasets.
- Audio transcription launch: Explore Fireworks’ state-of-the-art audio transcription models for fast and accurate speech-to-text applications.
- 📞 Contact us for enterprise solutions: Have complex requirements or need reserved capacity? Reach out to our team to discuss tailored solutions for your organization.