Responses API
Fireworks.ai offers a Responses API for stateful, multi-step interactions with models. This guide walks through its key features and how to use them.
Overview
The Responses API is designed for building conversational applications and complex workflows. It allows you to:
- Continue conversations: Maintain context across multiple turns without resending the entire history.
- Use external tools: Integrate with external services and data sources through the Model Context Protocol (MCP).
- Stream responses: Receive results as they are generated, enabling real-time applications.
Basic Usage
You can interact with the Responses API using the Fireworks Python SDK or by making direct HTTP requests.
Creating a Response
To start a new conversation, use the client.responses.create method. For a complete example, see the getting started notebook.
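As a rough illustration of the direct-HTTP route, the sketch below builds a create request with only the Python standard library. The base URL, the `/responses` path, the `model` and `input` field names, and the model identifier are assumptions made for this example and are not confirmed by this guide; check them against the Fireworks API reference before use.

```python
import json
import urllib.request

# Assumptions for illustration only: endpoint, field names, and model name.
BASE_URL = "https://api.fireworks.ai/inference/v1"
MODEL = "accounts/fireworks/models/your-model"  # placeholder, not a real model


def build_create_request(input_text, api_key, model=MODEL):
    """Build (but do not send) a POST request that starts a new conversation."""
    payload = {"model": model, "input": input_text}
    return urllib.request.Request(
        f"{BASE_URL}/responses",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON body of the reply."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Calling `send(build_create_request("What is the capital of France?", api_key))` with a valid key would perform the request. Keeping construction separate from sending makes the payload easy to inspect before anything goes over the network.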
Continuing a Conversation with previous_response_id
To continue a conversation, use the previous_response_id parameter. This tells the API to use the context from a previous response, so you don’t have to send the entire conversation history again. For a complete example, see the previous response ID notebook.
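A minimal sketch of the chaining step might look like the following. It assumes the earlier response is available as a dictionary whose `id` key holds the response ID, and that `model` and `input` are the request field names; only `previous_response_id` comes from this guide, and the helper name is hypothetical.

```python
def build_followup_payload(previous_response, input_text, model):
    """Build a request body that continues from an earlier response.

    Only the new user input is included; the earlier turns are referenced
    through previous_response_id instead of being resent.
    """
    return {
        "model": model,
        "input": input_text,
        # Assumes the earlier response exposes its ID under the "id" key.
        "previous_response_id": previous_response["id"],
    }


# Example with a made-up response ID.
first_response = {"id": "resp_example_123"}
followup = build_followup_payload(
    first_response, "And what is its population?", "your-model"
)
```

The resulting dictionary carries just the new question plus the reference to the prior turn, which is what lets the history stay on the server side.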
Streaming Responses
For real-time applications, you can stream the response as it’s being generated. For a complete example, see the streaming example notebook.
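To make the idea concrete, here is a sketch of consuming a streamed response on the client. It assumes the stream arrives as server-sent events whose `data:` lines carry JSON objects, that incremental text is delivered in events of type `response.output_text.delta` with a `delta` field, and that a `[DONE]` sentinel may end the stream. None of these details are stated in this guide, so treat the event names as placeholders to verify.

```python
import json


def iter_text_deltas(lines):
    """Yield text fragments from an iterable of SSE lines (bytes or str)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and "event:" lines
        data = line[len("data:"):].strip()
        if not data:
            continue
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        event = json.loads(data)
        # Assumed event type and field name for incremental text.
        if event.get("type") == "response.output_text.delta":
            yield event.get("delta", "")


# Canned lines standing in for a live HTTP stream.
sample_stream = [
    b"event: response.output_text.delta\n",
    b'data: {"type": "response.output_text.delta", "delta": "Hel"}\n',
    b"\n",
    b'data: {"type": "response.output_text.delta", "delta": "lo"}\n',
    b'data: {"type": "response.completed"}\n',
    b"data: [DONE]\n",
]
streamed_text = "".join(iter_text_deltas(sample_stream))
```

Because the parser accepts any iterable of lines, the same function can be fed an open HTTP response object in place of the canned list.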
Cookbook Examples
For more in-depth examples, check out the following notebooks:
- General MCP Examples
- Using previous_response_id
- Streaming Responses
- Using store=False
- MCP with Streaming
Storing Responses
By default, responses are stored and can be referenced by their ID. You can disable this by setting store=False. If you do, you will not be able to use previous_response_id to continue the conversation from that response. For a complete example, see the store=False notebook.
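The interaction between store and previous_response_id can be guarded on the client side. The sketch below is one possible approach, not part of any SDK: the helper name and the `previous_was_stored` flag are hypothetical, and the `model` and `input` field names are assumptions.

```python
def build_payload(model, input_text, store=True,
                  previous_response_id=None, previous_was_stored=True):
    """Build a request body, refusing to chain from an unstored response."""
    if previous_response_id is not None and not previous_was_stored:
        # A response created with store=False cannot be referenced later.
        raise ValueError(
            "previous response was created with store=False; "
            "resend the conversation history instead"
        )
    payload = {"model": model, "input": input_text, "store": store}
    if previous_response_id is not None:
        payload["previous_response_id"] = previous_response_id
    return payload


# A one-off request that should not be stored.
unstored = build_payload("your-model", "Summarize this text.", store=False)
```

Failing early on the client gives a clearer error than waiting for the API to reject a reference to a response it never stored.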