Introduction

Welcome to the Fireworks onboarding guide!

This guide is designed to help you quickly and effectively get started with the Fireworks platform, whether you’re a developer, researcher, or AI enthusiast. By following this step-by-step resource, you’ll learn how to explore and experiment with state-of-the-art AI models, prototype your ideas using Fireworks’ serverless infrastructure, and scale your projects with advanced on-demand deployments.

Who this guide is for

This guide is designed for new Fireworks users who are exploring the platform for the first time. It provides a hands-on introduction to the core features of Fireworks, including the model library, playgrounds, and on-demand deployments, all accessible through the web app.

For experienced users, this guide serves as a starting point, with future resources planned to dive deeper into advanced tools like firectl and other intermediate features to enhance your workflow.

Objectives of the guide

  • Explore the Fireworks model library: Navigate and select generative AI models for text, image, and audio tasks.
  • Experiment with the playground: Test prompts, tweak parameters, and generate outputs in real time.
  • Prototype effortlessly: Use Fireworks’ serverless infrastructure to deploy and iterate without managing servers.
  • Scale your AI: Learn how on-demand deployments offer predictable performance and advanced customization.
  • Develop complex systems: Unlock advanced capabilities like Compound AI, function calling, and retrieval-augmented generation to create production-ready applications.

By the end of this guide, you’ll be equipped with the knowledge and tools to confidently use Fireworks to build, scale, and optimize AI-powered solutions. Let’s get started!


Step 1. Explore our model library

Fireworks provides a range of leading open-source models for tasks like text generation, code generation, and image understanding.

With the Fireworks model library, you can choose from a wide range of popular LLMs, VLMs, LVMs, and audio models, as well as embedding models from Nomic AI.

Developers building generative AI applications can interact with Fireworks in multiple ways:

  • 🌐 Via the web app: Access the Fireworks platform directly in your browser for easy model management.
  • 🐍 Through our Python SDK: Programmatically integrate and manage models within your codebase.
  • 🔗 With external providers: Pass your Fireworks API key to third-party tools for seamless workflow integration.

For additional documentation and guides, check out our Cookbook, which includes community-contributed notebooks and applications.
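As a minimal sketch of the API route (the endpoint is Fireworks' OpenAI-compatible chat-completions URL; the model ID below is just an example, so swap in any ID from the library), a request can be assembled with only the Python standard library:

```python
import json
import os
import urllib.request

# Fireworks' OpenAI-compatible chat-completions endpoint.
API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"

def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Assemble a chat-completion request for a Fireworks-hosted model."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

if __name__ == "__main__" and "FIREWORKS_API_KEY" in os.environ:
    # Only runs if you have exported FIREWORKS_API_KEY; model ID is illustrative.
    req = build_request(
        "accounts/fireworks/models/llama-v3p1-8b-instruct",
        "Say hello in one sentence.",
        os.environ["FIREWORKS_API_KEY"],
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

The same request shape works from the Python SDK or any third-party client; only the authentication plumbing differs.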



Step 2. Experiment using the model playground

The easiest way to get started with Fireworks and test models with minimal setup is through the Model Playground. Here, you can experiment with prompts, adjust parameters, and get immediate feedback on results before moving to more advanced steps.

Each playground is tailored to a modality:

  • LLM Playground: experiment with text-based models and chat-style prompts.
  • Image Playground: test image models interactively with a visual interface.
  • Audio Playground: run audio transcription and translation tasks.

Each model in the Playground includes the following features, designed to enhance your experimentation and streamline your workflow:

  • 🎛️ Parameter controls: Adjust settings like temperature and max tokens for LLMs or image-specific parameters (e.g., Guidance Scale) for image generation models. These controls allow you to fine-tune the behavior and outputs of the models, helping you achieve the desired results for different use cases.

  • 🧩 Code samples: Copy ready-to-use snippets in Python, TypeScript, or Shell to integrate models directly into your applications. This takes the guesswork out of API integration and speeds up development, so you can focus on building impactful solutions.

  • 🎨 Additional UI elements: Leverage interactive features like file upload buttons for image or audio inputs, making it easy to test multimodal capabilities without any additional setup. This ensures a smooth, hands-on testing experience, even for complex workflows.

  • 🔍 Model ID: Displayed in the format accounts/fireworks/models/<model_name>, so you can switch between models by changing a single line of code.
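The playground's parameter controls map directly onto request fields. As an illustration (the model ID and values below are examples, not recommendations), a request body might look like:

```python
# Illustrative request body showing playground controls as API fields.
# Copy the exact model ID shown in the playground for the model you picked.
payload = {
    "model": "accounts/fireworks/models/llama-v3p1-8b-instruct",
    "messages": [{"role": "user", "content": "Write a haiku about fireworks."}],
    "temperature": 0.6,  # lower values make outputs more deterministic
    "max_tokens": 128,   # upper bound on the number of generated tokens
}
```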



Step 3. Prototyping with serverless

Fireworks’ serverless infrastructure lets you quickly prototype AI models without managing servers or committing to long-term contracts. This setup supports fast experimentation and seamless scaling for your projects.

Why use Fireworks serverless?

  • 🚀 Launch instantly: Deploy apps with no setup or configuration required.
  • 🎯 Focus on prompt engineering: Design and refine your prompts without worrying about infrastructure.
  • ⚙️ Adjust parameters easily: Modify settings like temperature and max tokens to customize model outputs.
  • 💰 Pay-as-you-go: Only pay for what you use, with pricing based on parameter size buckets, making it cost-effective for projects of any size.

To start prototyping, you’ll need to obtain your API key, which allows you to interact with Fireworks’ serverless models programmatically.

Using your API key

Your API key is essential for securely accessing and managing your serverless deployments. Here’s how to use it:

  • Via the API: Include your API key in the headers of your RESTful API requests to integrate Fireworks’ models into your applications.
  • Using our SDK: Configure the Fireworks Python library with your API key to manage and deploy models programmatically.
  • Through third-party tools: Pass your API key to third-party clients (like LangChain) to incorporate Fireworks into your existing workflows, enabling you to use serverless models seamlessly.

Additionally, Fireworks is OpenAI compatible, enabling you to leverage familiar OpenAI tools and integrations within your Fireworks projects.
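As one sketch of that compatibility (assuming the official openai Python package is installed; the base URL is Fireworks' OpenAI-compatible endpoint, and the model ID is an example):

```python
# Sketch: pointing the OpenAI Python client at Fireworks' OpenAI-compatible API.
# Assumes `pip install openai` and an exported FIREWORKS_API_KEY.
import os

FIREWORKS_BASE_URL = "https://api.fireworks.ai/inference/v1"

def make_client():
    from openai import OpenAI  # imported lazily so this sketch loads without the package
    return OpenAI(
        base_url=FIREWORKS_BASE_URL,
        api_key=os.environ["FIREWORKS_API_KEY"],
    )

if __name__ == "__main__" and "FIREWORKS_API_KEY" in os.environ:
    client = make_client()
    chat = client.chat.completions.create(
        model="accounts/fireworks/models/llama-v3p1-8b-instruct",  # example model
        messages=[{"role": "user", "content": "Summarize serverless inference in one line."}],
    )
    print(chat.choices[0].message.content)
```

Existing OpenAI-based code typically needs only the base URL and API key swapped to run against Fireworks.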



Step 4. Scale out with on-demand deployments

Fireworks’ on-demand deployments provide you with dedicated GPU instances, ensuring predictable performance and advanced customization options for your AI workloads. These deployments allow you to scale efficiently, optimize costs, and access exclusive models that aren’t available on serverless infrastructure.

Why choose on-demand deployments?

  • 🏎️ Predictable performance: Enjoy consistent performance unaffected by other users’ workloads.
  • 📈 Flexible scaling: Adjust replicas or GPU resources to handle varying workloads efficiently.
  • ⚙️ Customization: Choose GPU types, enable features like long-context support, and apply quantization to optimize costs.
  • 🔓 Expanded access: Deploy larger models or custom models from Hugging Face files.
  • 💰 Cost optimization: Save more with reserved capacity when you have high utilization needs.
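Future guides will cover the firectl CLI in depth, but as a rough preview of the deployment workflow (the command shape and flag names here are assumptions from memory; verify them with `firectl --help` before running anything):

```shell
# Create a dedicated deployment for a model (model ID is an example;
# flag names are assumptions -- confirm with `firectl create deployment --help`).
firectl create deployment accounts/fireworks/models/llama-v3p1-70b-instruct \
  --min-replica-count 1 \
  --max-replica-count 2

# Inspect running deployments and clean up when finished.
firectl list deployments
firectl delete deployment <deployment-id>
```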


Step 5. Building Compound AI systems

Expand your AI capabilities by incorporating advanced features like Compound AI, function calling, or retrieval-augmented generation (RAG). These tools enable you to build sophisticated applications that integrate seamlessly with external systems. For greater control, consider on-prem or BYOC deployments.

With Fireworks, you can:

  • 🛠️ Leverage advanced features: Build Compound AI systems with function calling, RAG, and agents (Advanced Features).
  • 🔗 Integrate external tools: Connect models with APIs, databases, or other services to enhance functionality.
  • 🔍 Optimize workflows: Use Fireworks’ advanced tools to streamline AI development, enhance system efficiency, and scale complex applications with ease.
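As a sketch of how function calling fits the same chat-completions API (the `get_weather` tool is hypothetical, something your own application would implement; Fireworks' function-calling models accept the OpenAI-style `tools` schema):

```python
# Sketch: an OpenAI-style function-calling request body for a Fireworks model.
def build_tool_call_payload(model: str, question: str) -> dict:
    """Build a chat request that offers the model a hypothetical weather tool."""
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool your app would implement
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "tools": [weather_tool],
    }
```

When the model decides the tool is needed, its response contains a structured tool call your code executes, feeding the result back as a follow-up message.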

Action items

  • 📚 Learn about Compound AI and Advanced Features: Explore richer functionality to create more sophisticated applications.

    • Fireworks Compound AI System: With f1, experience how specialized models work together to deliver groundbreaking performance, efficiency, and advanced reasoning capabilities.
    • Document inlining: Make any LLM capable of processing documents for seamless retrieval, summarization, and comprehension.
    • Multimodal enterprise: See how Fireworks integrates text, image, and audio models to power enterprise-grade multimodal AI solutions.
    • Multi-LoRA fine-tuning: Learn how Multi-LoRA fine-tuning enables precise model customization across diverse datasets.
    • Audio transcription launch: Explore Fireworks’ state-of-the-art audio transcription models for fast and accurate speech-to-text applications.
  • 📞 Contact us for enterprise solutions: Have complex requirements or need reserved capacity? Reach out to our team to discuss tailored solutions for your organization.


🌟 Dive deeper into the docs

Ready to learn more? Continue exploring the Fireworks documentation to uncover specific tools, workflows, and advanced features that can help you take your AI systems to the next level.