Fireworks.ai offers a powerful Response API that allows for more complex and stateful interactions with models. This guide will walk you through the key features and how to use them.
Overview
The Response API is designed for building conversational applications and complex workflows. It allows you to:
- Continue conversations: Maintain context across multiple turns without resending the entire history.
- Use external tools: Integrate with external services and data sources through the Model Context Protocol (MCP).
- Stream responses: Receive results as they are generated, enabling real-time applications.
Basic Usage
You can interact with the Response API using the Fireworks Python SDK or by making direct HTTP requests.
Creating a Response
To start a new conversation, you use the client.responses.create
method. For a complete example, see the getting started notebook.
import os
from openai import OpenAI
client = OpenAI(
base_url="https://api.fireworks.ai/inference/v1",
api_key=os.getenv("FIREWORKS_API_KEY", "YOUR_FIREWORKS_API_KEY_HERE")
)
response = client.responses.create(
model="accounts/fireworks/models/qwen3-235b-a22b",
input="What is reward-kit and what are its 2 main features? Keep it short Please analyze the fw-ai-external/reward-kit repository.",
tools=[{"type": "sse", "server_url": "https://gitmcp.io/docs"}]
)
print(response.output[-1].content[0].text.split("</think>")[-1])
Continuing a Conversation with previous_response_id
To continue a conversation, you can use the previous_response_id
parameter. This tells the API to use the context from a previous response, so you don’t have to send the entire conversation history again. For a complete example, see the previous response ID notebook.
import os
from openai import OpenAI
client = OpenAI(
base_url="https://api.fireworks.ai/inference/v1",
api_key=os.getenv("FIREWORKS_API_KEY", "YOUR_FIREWORKS_API_KEY_HERE")
)
# First, create an initial response
initial_response = client.responses.create(
model="accounts/fireworks/models/qwen3-235b-a22b",
input="What are the key features of reward-kit?",
tools=[{"type": "sse", "server_url": "https://gitmcp.io/docs"}]
)
initial_response_id = initial_response.id
# Now, continue the conversation
continuation_response = client.responses.create(
model="accounts/fireworks/models/qwen3-235b-a22b",
input="How do I install it?",
previous_response_id=initial_response_id,
tools=[{"type": "sse", "server_url": "https://gitmcp.io/docs"}]
)
print(continuation_response.output[-1].content[0].text.split("</think>")[-1])
Streaming Responses
For real-time applications, you can stream the response as it’s being generated. For a complete example, see the streaming example notebook.
import os
from openai import OpenAI
client = OpenAI(
base_url="https://api.fireworks.ai/inference/v1",
api_key=os.getenv("FIREWORKS_API_KEY", "YOUR_FIREWORKS_API_KEY_HERE")
)
stream = client.responses.create(
model="accounts/fireworks/models/qwen3-235b-a22b",
input="give me 5 interesting facts on modelcontextprotocol/python-sdk -- keep it short!",
stream=True,
tools=[{"type": "mcp", "server_url": "https://mcp.deepwiki.com/mcp"}]
)
for chunk in stream:
print(chunk)
Cookbook Examples
For more in-depth examples, check out the following notebooks:
Storing Responses
By default, responses are stored and can be referenced by their ID. You can disable this by setting store=False
. If you do this, you will not be able to use the previous_response_id
to continue the conversation. For a complete example, see the store=False notebook.
import os
from openai import OpenAI
client = OpenAI(
base_url="https://api.fireworks.ai/inference/v1",
api_key=os.getenv("FIREWORKS_API_KEY", "YOUR_FIREWORKS_API_KEY_HERE")
)
response = client.responses.create(
model="accounts/fireworks/models/qwen3-235b-a22b",
input="give me 5 interesting facts on modelcontextprotocol/python-sdk -- keep it short!",
store=False,
tools=[{"type": "mcp", "server_url": "https://mcp.deepwiki.com/mcp"}]
)
# This will fail because the previous response was not stored
try:
continuation_response = client.responses.create(
model="accounts/fireworks/models/qwen3-235b-a22b",
input="Explain the second fact in more detail.",
previous_response_id=response.id
)
except Exception as e:
print(e)