
Introduction

Welcome to the Fireworks onboarding guide! This guide is designed to help you quickly and effectively get started with the Fireworks platform, whether you’re a developer, researcher, or AI enthusiast. By following this step-by-step resource, you’ll learn how to explore and experiment with state-of-the-art AI models, prototype your ideas using Fireworks’ serverless infrastructure, and scale your projects with advanced on-demand deployments.

Who this guide is for

This guide is designed for new Fireworks users who are exploring the platform for the first time. It provides a hands-on introduction to the core features of Fireworks, including the model library, playgrounds, and on-demand deployments, all accessible through the web app. For experienced users, this guide serves as a starting point, with future resources planned to dive deeper into advanced tools like firectl and other intermediate features to enhance your workflow.

Objectives of the guide

  • Explore the Fireworks model library: Navigate and select generative AI models for text, image, and audio tasks.
  • Experiment with the playground: Test prompts, tweak parameters, and generate outputs in real time.
  • Prototype effortlessly: Use Fireworks’ serverless infrastructure to deploy and iterate without managing servers.
  • Scale your AI: Learn how on-demand deployments offer predictable performance and advanced customization.
  • Develop complex systems: Unlock advanced capabilities like Compound AI, function calling, and retrieval-augmented generation to create production-ready applications.
By the end of this guide, you’ll be equipped with the knowledge and tools to confidently use Fireworks to build, scale, and optimize AI-powered solutions. Let’s get started!

Step 1. Explore our model library

Fireworks provides leading open-source models for tasks like text generation, code generation, and image understanding. In the Fireworks model library, you can choose from a wide range of popular LLMs, VLMs, LVMs, and audio models, as well as embedding models from Nomic AI.
In this video, we introduce the Fireworks Model Library, your gateway to a diverse range of open-source and proprietary models designed for tasks like text generation, image understanding, and audio processing. Whether you’re a developer or a creative, Fireworks makes it easy to find and integrate the right tools for your generative AI needs.

What you’ll learn:

1️⃣ Navigating the model library: Browse popular models, filter by deployment type, and search for specific tools like Llama, Whisper, and Flux.
2️⃣ Customizing your experience: Use filters like “Serverless Models” to find models that fit your specific needs.
3️⃣ Seamless integration: Discover how Fireworks simplifies finding and managing AI models.
Developers building generative AI applications can interact with Fireworks in multiple ways:
  • 🌐 Via the web app: Access the Fireworks platform directly in your browser for easy model management.
  • 🐍 Through our Python SDK: Programmatically integrate and manage models within your codebase.
  • 🔗 With external providers: Pass your Fireworks API key to third-party tools for seamless workflow integration.
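To make the Python SDK path concrete, here is a minimal sketch assuming the `fireworks-ai` package and its OpenAI-style client; the model ID is only an example, so substitute any chat model from the library.

```python
# pip install fireworks-ai   (assumed package name; check the docs for the current SDK)
from fireworks.client import Fireworks

# Pass your API key explicitly, or load it from a secure location of your choice.
client = Fireworks(api_key="<FIREWORKS_API_KEY>")

response = client.chat.completions.create(
    # Example model ID; any chat-capable model from the library can go here.
    model="accounts/fireworks/models/llama-v3p1-8b-instruct",
    messages=[{"role": "user", "content": "Say hello from Fireworks!"}],
)
print(response.choices[0].message.content)
```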
For additional documentation and guides, check out our Cookbook, which includes community-contributed notebooks and applications.

Step 2. Experiment using the model playground

The easiest way to get started with Fireworks and test models with minimal setup is through the Model Playground. Here, you can experiment with prompts, adjust parameters, and get immediate feedback on results before moving to more advanced steps. Take a closer look at how the LLM Playground lets you experiment with text-based models.
In this video, we explore the Fireworks Model Playground, the easiest way to experiment with LLMs, adjust parameters, and get instant feedback. Whether you’re crafting creative prompts, refining outputs, or testing model performance, the Playground is your go-to tool for seamless experimentation.

✨ What you’ll learn:

  • 🔍 Getting started: Access the Playground from the Model Library by selecting models like Llama 3.3 70B Instruct.
  • 📋 Model details: Discover key information, including starter code in Python, TypeScript, Java, Go, and Shell for Chat and Completion modes.
  • 🎭 Running prompts: Test creative prompts like “Write a synopsis of the modern 2020 version of the Cats musical” and see instant results.
  • 🎛️ Parameter controls: Adjust settings like temperature and max tokens to refine outputs to your liking.
  • Completion mode: Explore latency and tokens-per-second metrics with prompts like “Write a synopsis of the modern 2020 Tarzan movie with Brendan Fraser.”
  • 💻 Code integration: Generate ready-to-use code snippets directly from the Playground for effortless integration into your projects.
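Those generated snippets boil down to a single HTTP call against the OpenAI-compatible API. As a rough illustration (not the exact snippet the Playground emits), a completion-mode request with the parameters above might look like this; the model ID and parameter values are placeholders.

```python
import os

import requests

# Completion-mode request; the model ID and parameter values are illustrative,
# so copy the exact snippet from the Playground's code tab for your model.
resp = requests.post(
    "https://api.fireworks.ai/inference/v1/completions",
    headers={"Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}"},
    json={
        "model": "accounts/fireworks/models/llama-v3p3-70b-instruct",
        "prompt": "Write a synopsis of the modern 2020 Tarzan movie with Brendan Fraser.",
        "temperature": 0.7,  # higher values give more varied output
        "max_tokens": 512,   # upper bound on generated tokens
    },
)
resp.raise_for_status()
print(resp.json()["choices"][0]["text"])
```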
Discover how the Image Playground transforms visual AI experimentation into an intuitive process.
In this video, we dive into the Fireworks Image Playground, where you can create stunning visuals, refine parameters, and explore the possibilities of AI-driven image generation. Perfect for developers, designers, and creators, the Image Playground is your gateway to experimenting with prompts and parameters for artistic and practical outputs.

✨ What you’ll learn:

  • ☑️ Getting started: Navigate the Model Library to find image models like FLUX.1 schnell FP8 and open them in the Model Playground.
  • ☑️ Crafting prompts: Use creative prompts like “Movie poster for a film set in a world where gravity doesn’t exist” and watch the model bring your vision to life.
  • ☑️ Adjusting parameters: Experiment with settings like Guidance Scale, Inference Steps, and Seed to refine and perfect your results.
  • ☑️ Exploring variants: Test different models, such as FLUX.1 dev FP8, for varied image quality and creative flexibility.
  • ☑️ Integrating code: Generate and view sample code in Python, TypeScript, or Shell, complete with request parameters and response codes for seamless integration.
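As a rough sketch of what an image request can look like in Python: the endpoint path and parameter names below are assumptions based on the FLUX text-to-image route, so rely on the snippet shown on the model's page for the authoritative request.

```python
import os

import requests

# Illustrative only: the endpoint path and parameter names are assumptions;
# the Playground's code tab shows the exact request for each image model.
url = (
    "https://api.fireworks.ai/inference/v1/workflows/"
    "accounts/fireworks/models/flux-1-schnell-fp8/text_to_image"
)
resp = requests.post(
    url,
    headers={
        "Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}",
        "Accept": "image/jpeg",  # ask for raw image bytes in the response
    },
    json={
        "prompt": "Movie poster for a film set in a world where gravity doesn't exist",
        "guidance_scale": 3.5,     # how closely the image follows the prompt
        "num_inference_steps": 4,  # FLUX.1 schnell is tuned for few steps
        "seed": 42,                # fix the seed for reproducible results
    },
)
resp.raise_for_status()
with open("poster.jpg", "wb") as f:
    f.write(resp.content)
```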
Experience how the Audio Playground empowers advanced audio transcription and translation tasks.
Welcome to Part 2C of our onboarding series! In this video, we explore the Fireworks Audio Playground, showcasing the incredible speed and accuracy of the Whisper Turbo models. Whether you’re transcribing, translating, or analyzing audio, Fireworks makes it easy to experiment and unlock the potential of advanced audio models.

✨ What you’ll learn:

  • 🎵 Real-world test case: Using the song Do You Hear the People Sing? from Les Misérables, featuring nine distinct languages and various English accents, to demonstrate transcription and translation capabilities.
  • 🔍 Navigating the model library: Find Whisper v3 Turbo and access its playground.
  • 📂 Uploading audio: Test the model with screen-recorded audio to ensure unbiased results without metadata influence.
  • Fast and accurate transcription: Observe Whisper Turbo’s ability to transcribe multilingual content at lightning speed and compare its output to the original lyrics.

🔑 Key features of the Audio Playground:

  • 🌍 Multilingual capabilities: Whisper Turbo excels in recognizing and transcribing multiple languages and dialects.
  • Incredible speed: Experience near-instant transcriptions for even complex audio files.
  • 🎛️ Interactive testing: Upload audio, tweak parameters, and explore transcription and translation features in real time.
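Outside the Playground, the same transcription can be scripted. The sketch below assumes an OpenAI-style transcriptions route and the whisper-v3-turbo model name; treat both as assumptions and use the snippet from the model page for the authoritative endpoint.

```python
import os

import requests

# Endpoint and model name are assumptions; the Audio Playground's code tab
# shows the exact request for Whisper v3 Turbo.
with open("people_sing.mp3", "rb") as audio_file:
    resp = requests.post(
        "https://api.fireworks.ai/inference/v1/audio/transcriptions",
        headers={"Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}"},
        files={"file": audio_file},
        data={"model": "whisper-v3-turbo"},
    )
resp.raise_for_status()
print(resp.json()["text"])  # transcribed (or translated) text
```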
Each model in the Playground includes the following features, designed to enhance your experimentation and streamline your workflow:
  • 🎛️ Parameter controls: Adjust settings like temperature and max tokens for LLMs or image-specific parameters (e.g., Guidance Scale) for image generation models. These controls allow you to fine-tune the behavior and outputs of the models, helping you achieve the desired results for different use cases.
  • 🧩 Code samples: Copy-paste ready-to-use code in Python, TypeScript, or Shell to integrate models directly into your applications. This eliminates the guesswork of API implementation and speeds up development, so you can focus on building impactful solutions.
  • 🎨 Additional UI elements: Leverage interactive features like file upload buttons for image or audio inputs, making it easy to test multimodal capabilities without any additional setup. This ensures a smooth, hands-on testing experience, even for complex workflows.
  • 🔍 Model ID: Clearly displayed in the format accounts/fireworks/models/<model_name>, allowing you to switch between models effortlessly with a single line of code, making experimentation and integration faster and more efficient.
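Because every model follows this ID scheme, switching models really is a one-line change; a small sketch assuming the `fireworks-ai` client (both model IDs are examples):

```python
from fireworks.client import Fireworks

client = Fireworks(api_key="<FIREWORKS_API_KEY>")

# Swapping models is just a matter of changing this one string.
model_id = "accounts/fireworks/models/llama-v3p3-70b-instruct"
# model_id = "accounts/fireworks/models/qwen2p5-72b-instruct"  # example alternative

reply = client.chat.completions.create(
    model=model_id,
    messages=[{"role": "user", "content": "Summarize what a model ID is."}],
)
print(reply.choices[0].message.content)
```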

Step 3. Prototyping with serverless

Fireworks’ serverless infrastructure lets you quickly prototype AI models without managing servers or committing to long-term contracts. This setup supports fast experimentation and seamless scaling for your projects.

Why use Fireworks serverless?

  • 🚀 Launch instantly: Deploy apps with no setup or configuration required.
  • 🎯 Focus on prompt engineering: Design and refine your prompts without worrying about infrastructure.
  • ⚙️ Adjust parameters easily: Modify settings like temperature and max tokens to customize model outputs.
  • 💰 Pay-as-you-go: Only pay for what you use, with pricing based on parameter size buckets, making it cost-effective for projects of any size.
To start prototyping, you’ll need to obtain your API key, which allows you to interact with Fireworks’ serverless models programmatically.
In this video, we’ll guide you through generating your Fireworks API key, the first step to leveraging Fireworks’ serverless infrastructure. Prototype AI models with ease, scale seamlessly, and focus on building without worrying about managing servers.

✨ Why use Fireworks serverless?

  • 🚀 Launch instantly: Deploy apps with no setup or configuration required.
  • 🎯 Focus on prompt engineering: Refine your prompts without infrastructure headaches.
  • ⚙️ Adjust parameters easily: Tweak settings like temperature and max tokens to customize outputs.
  • 💰 Pay-as-you-go: Cost-effective pricing based on usage, perfect for projects of any size.

🛠️ How to get your API key:

1️⃣ Navigate to User Settings: Log in to your Fireworks account and click the profile icon.
2️⃣ Generate your key: Select ‘API Keys’ and click ‘Create API Key’ to generate your unique key.
3️⃣ Copy and secure: Save your API key securely—it’s essential for authentication.
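A common next step is to keep the key out of source code entirely and load it from an environment variable; a minimal sketch (FIREWORKS_API_KEY is the conventional name, but any variable works):

```python
import os

# Read the key from the environment instead of hard-coding it in your project.
# Set it beforehand in your shell, e.g.: export FIREWORKS_API_KEY="<your key>"
api_key = os.environ.get("FIREWORKS_API_KEY")
if not api_key:
    raise RuntimeError(
        "FIREWORKS_API_KEY is not set; create a key under API Keys in the web app."
    )
```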

Using your API key

Your API key is essential for securely accessing and managing your serverless deployments. Here’s how to use it:
  • Via the API: Include your API key in the headers of your RESTful API requests to integrate Fireworks’ models into your applications.
  • Using our SDK: Configure the Fireworks Python library with your API key to manage and deploy models programmatically.
  • Through third-party tools: Pass your API key to third-party clients (like LangChain) to incorporate Fireworks into your existing workflows, enabling you to use serverless models seamlessly.
Additionally, Fireworks is OpenAI compatible, enabling you to leverage familiar OpenAI tools and integrations within your Fireworks projects.
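For example, the official `openai` Python package can be pointed at Fireworks by overriding the base URL; a minimal sketch, with an example model ID:

```python
import os

from openai import OpenAI

# Point the OpenAI client at Fireworks' OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

chat = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p1-8b-instruct",  # example model ID
    messages=[{"role": "user", "content": "What does OpenAI compatibility buy me here?"}],
)
print(chat.choices[0].message.content)
```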
In this video, we’ll show you how to use your Fireworks API key to call serverless LLMs and effortlessly prototype with Fireworks’ serverless infrastructure. Whether you’re creating structured datasets or testing model outputs, Fireworks makes scaling your ideas simple—no servers required!

✨ What you’ll learn:

  • 📖 Accessing the Cookbook: Explore Fireworks’ GitHub repo and open example notebooks like “Llama 3.1 Synthetic Data Generation” in Colab.
  • 🔑 Using your API key: Learn how to securely generate and use your Fireworks API key for authentication.
  • 🤖 Interacting with models: Call Llama 3.1 models to generate structured synthetic data and customize outputs.
  • 🎯 Prompt engineering in action: See how to craft prompts to generate JSON-structured quiz questions with context, responses, and metadata.
Watch as we:
  • 📍 Generate geography quiz questions: Using Llama 3.1 405B for structured outputs.
  • 💾 Save data: Store structured data in JSONL format for project use.
  • Showcase flexibility: Highlight how Fireworks supports dataset creation, testing, and more.
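As a rough sketch of that workflow (not the exact Cookbook notebook), the snippet below asks a Llama 3.1 model for a JSON-formatted quiz question and appends it to a JSONL file; the model ID and the JSON-mode `response_format` parameter are assumptions to verify against the current docs.

```python
import json
import os

from fireworks.client import Fireworks

client = Fireworks(api_key=os.environ["FIREWORKS_API_KEY"])

# Ask for a JSON-structured quiz question. The model ID and the JSON-mode
# response_format are assumptions; adjust them to match the current docs.
resp = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p1-405b-instruct",
    messages=[{
        "role": "user",
        "content": (
            "Generate one geography quiz question as JSON with the keys "
            "'context', 'question', 'responses' (a list of 4 options), and 'answer'."
        ),
    }],
    response_format={"type": "json_object"},
)

record = json.loads(resp.choices[0].message.content)

# Append the structured record to a JSONL dataset file.
with open("quiz_questions.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")
```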

Step 4. Scale out with on-demand deployments

Fireworks’ on-demand deployments provide you with dedicated GPU instances, ensuring predictable performance and advanced customization options for your AI workloads. These deployments allow you to scale efficiently, optimize costs, and access exclusive models that aren’t available on serverless infrastructure.

Why choose on-demand deployments?

  • 🏎️ Predictable performance: Enjoy consistent performance unaffected by other users’ workloads.
  • 📈 Flexible scaling: Adjust replicas or GPU resources to handle varying workloads efficiently.
  • ⚙️ Customization: Choose GPU types, enable features like long-context support, and apply quantization to optimize costs.
  • 🔓 Expanded access: Deploy larger models or custom models from Hugging Face files.
  • 💰 Cost optimization: Save more with reserved capacity when you have high utilization needs.

Step 5. Building Compound AI systems

Expand your AI capabilities by incorporating advanced features like Compound AI, function calling, or retrieval-augmented generation (RAG). These tools enable you to build sophisticated applications that integrate seamlessly with external systems. For greater control, consider on-prem or BYOC deployments.

With Fireworks, you can:

  • 🛠️ Leverage advanced features: Build Compound AI systems with function calling, RAG, and agents (Advanced Features).
  • 🔗 Integrate external tools: Connect models with APIs, databases, or other services to enhance functionality.
  • 🔍 Optimize workflows: Use Fireworks’ advanced tools to streamline AI development, enhance system efficiency, and scale complex applications with ease.
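To make function calling concrete, here is a hedged sketch using an OpenAI-style `tools` list with the `fireworks-ai` client; the tool schema and model ID are illustrative assumptions, and your application would route the returned call to a real API or database.

```python
import json
import os

from fireworks.client import Fireworks

client = Fireworks(api_key=os.environ["FIREWORKS_API_KEY"])

# A hypothetical tool definition; a real application would map this to an API or database call.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p1-70b-instruct",  # example model ID
    messages=[{"role": "user", "content": "What's the weather in Tokyo right now?"}],
    tools=tools,
)

# If the model decided to call the tool, inspect the proposed arguments.
for call in resp.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```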

Action items

  • 📚 Learn about Compound AI and Advanced Features: Explore richer functionality to create more sophisticated applications.
    • Fireworks Compound AI System: With f1, experience how specialized models work together to deliver groundbreaking performance, efficiency, and advanced reasoning capabilities.
    • Multimodal enterprise: See how Fireworks integrates text, image, and audio models to power enterprise-grade multimodal AI solutions.
    • Multi-LoRA fine-tuning: Learn how Multi-LoRA fine-tuning enables precise model customization across diverse datasets.
    • Audio transcription launch: Explore Fireworks’ state-of-the-art audio transcription models for fast and accurate speech-to-text applications.
  • 📞 Contact us for enterprise solutions: Have complex requirements or need reserved capacity? Reach out to our team to discuss tailored solutions for your organization.

🌟 Dive deeper into the docs

Ready to learn more? Continue exploring the Fireworks documentation to uncover specific tools, workflows, and advanced features that can help you take your AI systems to the next level.