When using Fireworks.ai through Hugging Face, you have two options for authentication:
Direct Requests: Use your Fireworks.ai API key in your Hugging Face user account settings. In this mode, inference requests are sent directly to Fireworks.ai, and billing is handled by your Fireworks.ai account.
Routed Requests: If you don’t configure a Fireworks.ai API key, your requests will be routed through Hugging Face. In this case, you can use a Hugging Face token for authentication. Billing for routed requests is applied to your Hugging Face account at standard provider API rates. You don’t need an account on Fireworks.ai to do this, just use your HF one!
To add a Fireworks.ai API key to your Hugging Face settings, follow these steps:
The examples below demonstrate how to interact with various models using Python. First, ensure you have the huggingface_hub library installed (version v0.29.0 or later):
```shell
pip install "huggingface_hub>=0.29.0"
```
Chat Completion (LLMs) with Hugging Face Hub library
```python
from huggingface_hub import InferenceClient

# Initialize the InferenceClient with Fireworks.ai as the provider
client = InferenceClient(
    provider="fireworks-ai",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxx",  # Replace with your API key (HF or custom)
)

# Define the chat messages
messages = [
    {
        "role": "user",
        "content": "What is the capital of France?"
    }
]

# Generate a chat completion
completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=messages,
    max_tokens=500,
)

# Print the response
print(completion.choices[0].message)
```
You can swap this for any compatible LLM from Fireworks.ai; the full list of supported models is available on the Hub.
Vision Language Models (VLMs) with Hugging Face Hub Library
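A VLM call follows the same chat-completion pattern as above, with the image passed as an `image_url` part in the message content. A minimal sketch (the model name and image URL below are example placeholders; swap in any compatible VLM):

```python
import os

from huggingface_hub import InferenceClient

# Example model name -- replace with any Fireworks.ai-compatible VLM
MODEL = "Qwen/Qwen2.5-VL-32B-Instruct"

# A VLM message mixes text and image parts in the "content" list
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image in one sentence."},
            {
                "type": "image_url",
                # placeholder image URL
                "image_url": {"url": "https://example.com/cat.png"},
            },
        ],
    }
]

# Only call the API when a token is available
if os.environ.get("HF_TOKEN"):
    client = InferenceClient(provider="fireworks-ai", api_key=os.environ["HF_TOKEN"])
    completion = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        max_tokens=500,
    )
    print(completion.choices[0].message)
```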
Similar to LLMs, you can use any compatible VLM from the list of supported models on the Hub.

You can also call inference providers via the OpenAI Python client. You will need to specify the `base_url` and `model` parameters in the client constructor and the call, respectively. The easiest way is to go to a model's page on the Hub and copy the snippet.
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/fireworks-ai/inference/v1",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxx",  # Fireworks.ai or Hugging Face API key
)

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?"
    }
]

completion = client.chat.completions.create(
    model="<provider_specific_model_name>",
    messages=messages,
    max_tokens=500,
)

print(completion.choices[0].message)
```
Text-to-Image Generation
```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="fireworks-ai",
    api_key=os.environ["HF_TOKEN"],
)

# text_to_image takes the prompt as its first argument and
# returns a PIL.Image.Image
generated_image = client.text_to_image(
    "Bob Marley in the style of a painting by Johannes Vermeer",
    model="black-forest-labs/FLUX.1-schnell",
)
generated_image.save("generated_image.png")
```
You can search for all Fireworks.ai models on the Hub and try the available models directly via the widget on each model page. We'll continue to increase the number of models and ways to try them out!