Reasoning models return their thinking in a dedicated reasoning_content field. This field contains the model's internal reasoning, which would otherwise appear in <think></think> tags within the content field. For some models, the reasoning content may instead be included directly in the content field itself.
Prerequisites
We recommend using the Fireworks Python SDK to work with reasoning models, as it supports Fireworks-specific parameters and response fields. The SDK is currently in alpha; use the --pre flag when installing to get the latest version.
pip install --pre fireworks-ai
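The examples on this page assume your API key is available in the environment (the longer scripts below load it with python-dotenv). You can also pass the key explicitly when constructing the client; the api_key argument below follows common SDK conventions and is shown as an assumption, so check the SDK reference for the exact constructor signature.

import os
from fireworks import Fireworks

# Assumes the key is stored in the FIREWORKS_API_KEY environment variable.
# Passing api_key explicitly is a sketch of a common convention, not a
# documented guarantee - verify against the SDK reference.
client = Fireworks(api_key=os.environ["FIREWORKS_API_KEY"])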
Basic usage
Select a reasoning model from our serverless model library.
from fireworks import Fireworks

client = Fireworks()

completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "What is 25 * 37?",
        }
    ],
    model="accounts/fireworks/models/<reasoning-model>",
)

for choice in completion.choices:
    # Access the reasoning content (thinking process)
    if choice.message.reasoning_content:
        print("Reasoning:", choice.message.reasoning_content)
    print("Answer:", choice.message.content)
Controlling reasoning effort
You can control the reasoning token length using the reasoning_effort parameter:
completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Solve this step by step: If a train travels at 60 mph for 2.5 hours, how far does it go?",
        }
    ],
    model="accounts/fireworks/models/<reasoning-model>",
    reasoning_effort="medium",
)
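To get a feel for the trade-off, you can compare how much reasoning the model emits at different effort levels. This is a rough sketch that assumes "low", "medium", and "high" are accepted values for your chosen model; check the model page for the exact set it supports.

question = "Solve this step by step: If a train travels at 60 mph for 2.5 hours, how far does it go?"

# Assumed effort levels; adjust to what the model actually supports.
for effort in ["low", "medium", "high"]:
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": question}],
        model="accounts/fireworks/models/<reasoning-model>",
        reasoning_effort=effort,
    )
    reasoning = response.choices[0].message.reasoning_content or ""
    print(f"{effort}: ~{len(reasoning)} characters of reasoning")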
Streaming with reasoning content
When streaming, the reasoning content is available in each chunk's delta:
from fireworks import Fireworks

client = Fireworks()

stream = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "What is the square root of 144?",
        }
    ],
    model="accounts/fireworks/models/<reasoning-model>",
    reasoning_effort="medium",
    stream=True,
)

reasoning_parts = []
content_parts = []
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.reasoning_content:
        reasoning_parts.append(delta.reasoning_content)
    if delta.content:
        content_parts.append(delta.content)

print("Reasoning:", "".join(reasoning_parts))
print("Answer:", "".join(content_parts))
Interleaved thinking
When building multi-turn tool-calling agents with models that support interleaved thinking, you must include the reasoning_content from previous assistant turns in subsequent requests. This enables the model to think between tool calls and after receiving tool results, allowing for more complex, step-by-step reasoning.
There are two ways to do this:

- Pass the Message object directly (recommended) - The SDK message object already contains the reasoning_content field alongside content and tool_calls
- Manually include reasoning_content - When constructing messages as dictionaries, explicitly add the reasoning_content field

Interleaved thinking is triggered when the last message in your API request has "role": "tool", enabling the model to use its previous reasoning process when responding to the tool result. If a model does not support interleaved thinking, it simply ignores the extra reasoning context, so this pattern is safe to use broadly.
Pass Message object
# First turn: Get a response with reasoning_content
first_response = client.chat.completions.create(
    messages=[{"role": "user", "content": "What is 15 + 27?"}],
    model="accounts/fireworks/models/<reasoning-model>",
    tools=tools,
)

# The assistant message contains reasoning_content, content, and tool_calls
assistant_message = first_response.choices[0].message
# assistant_message.reasoning_content -> "The user is asking for addition..."
# assistant_message.tool_calls -> [ToolCall(id="...", function=...)]

# Second turn: Pass the Message object directly
# This automatically includes reasoning_content alongside the message
second_response = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "What is 15 + 27?"},
        assistant_message,  # Pass the complete Message object
        {"role": "tool", "content": "42", "tool_call_id": assistant_message.tool_calls[0].id},
    ],
    model="accounts/fireworks/models/<reasoning-model>",
    tools=tools,
)
Manual dictionary
# First turn: Get a response with reasoning_content
first_response = client.chat.completions.create(
    messages=[{"role": "user", "content": "What is 15 + 27?"}],
    model="accounts/fireworks/models/<reasoning-model>",
    tools=tools,
)
assistant_message = first_response.choices[0].message

# Second turn: Manually construct the assistant message dict
# Include reasoning_content explicitly alongside role, content, and tool_calls
second_response = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "What is 15 + 27?"},
        {
            "role": "assistant",
            "content": assistant_message.content,
            "reasoning_content": assistant_message.reasoning_content,  # Include reasoning
            "tool_calls": assistant_message.tool_calls,
        },
        {"role": "tool", "content": "42", "tool_call_id": assistant_message.tool_calls[0].id},
    ],
    model="accounts/fireworks/models/<reasoning-model>",
    tools=tools,
)
If you construct the assistant message manually as a dictionary but omit the reasoning_content field, the model will not have access to its previous reasoning process.
The following example verifies that reasoning_content from the first turn is included in subsequent requests:
main.py
"""Test that reasoning_content is passed in multi-turn conversations.
This test proves that reasoning_content from previous turns is included
in subsequent requests by examining the raw prompt sent to the model.
"""
from fireworks import Fireworks
from dotenv import load_dotenv
load_dotenv()
client = Fireworks()
MODEL = "accounts/fireworks/models/kimi-k2-thinking"
# MODEL = "accounts/fireworks/models/minimax-m2"
# Define tools to enable interleaved thinking
tools = [
{
"type": "function",
"function": {
"name": "calculator",
"description": "Perform basic arithmetic operations",
"parameters": {
"type": "object",
"properties": {
"operation": {
"type": "string",
"enum": ["add", "subtract", "multiply", "divide"],
},
"a": {"type": "number"},
"b": {"type": "number"},
},
"required": ["operation", "a", "b"],
},
},
}
]
def print_header(title: str, char: str = "β", width: int = 60):
"""Print a formatted section header."""
print(f"\n{char * width}")
print(f" {title}")
print(f"{char * width}")
def print_field(label: str, value: str, indent: int = 2):
"""Print a labeled field with optional indentation."""
prefix = " " * indent
print(f"{prefix}{label}: {value}")
def print_multiline(label: str, content: str, max_preview: int = 200, indent: int = 2):
"""Print multiline content with a label and optional truncation."""
prefix = " " * indent
print(f"{prefix}{label}:")
preview = content[:max_preview] + "..." if len(content) > max_preview else content
for line in preview.split("\n"):
print(f"{prefix} β {line}")
# First turn - get a response with reasoning_content
print_header("FIRST TURN", "β")
first_response = client.chat.completions.create(
messages=[
{
"role": "user",
"content": "What is 15 + 27?",
}
],
model=MODEL,
tools=tools,
)
print_field("π Content", first_response.choices[0].message.content or "(none)")
reasoning = first_response.choices[0].message.reasoning_content
print_multiline("π Reasoning", reasoning)
# Print tool call (verified) from the first response
tool_calls = first_response.choices[0].message.tool_calls
assert tool_calls, "No tool calls in first response!"
print(f"\n π§ Tool Calls ({len(tool_calls)}):")
for i, tc in enumerate(tool_calls, 1):
print(f" [{i}] id={tc.id}")
print(f" function={tc.function.name}")
print(f" arguments={tc.function.arguments}")
tool_call_id = first_response.choices[0].message.tool_calls[0].id
# Verify we got reasoning_content
assert reasoning and len(reasoning) > 0, "No reasoning_content in first response!"
print("\n β First response has reasoning_content")
# Second turn - include the first assistant message
print_header("SECOND TURN", "β")
second_response = client.chat.completions.create(
messages=[
{
"role": "user",
"content": "What is 15 + 27?",
},
first_response.choices[0].message, # Includes reasoning_content
{"role": "tool", "content": "42", "tool_call_id": tool_call_id},
],
model=MODEL,
tools=tools,
raw_output=True,
)
print_field("π Answer", second_response.choices[0].message.content or "(none)")
# Extract and display the raw prompt that was sent to the model
raw_prompt = second_response.choices[0].raw_output.prompt_fragments[0]
print_header("RAW PROMPT SENT TO MODEL", "β")
print(raw_prompt)
# Check if reasoning_content from first turn is in the raw prompt
has_reasoning_content = reasoning[:50] in raw_prompt
print_header("RESULT", "β")
if has_reasoning_content:
print(" β
SUCCESS: reasoning_content IS included in subsequent requests!")
else:
print(" β FAILED: reasoning_content not found in raw prompt")
print()
Example output:
============================================================
  FIRST TURN
============================================================
  Content: (none)
  Reasoning:
    | The user is asking for a simple addition calculation: 15 + 27.
    |
    | I should use the calculator function with:
    | - operation: "add"
    | - a: 15
    | - b: 27

  Tool Calls (1):
    [1] id=functions.calculator:0
        function=calculator
        arguments={"operation": "add", "a": 15, "b": 27}

  First response has reasoning_content

============================================================
  SECOND TURN
============================================================
  Answer: 15 + 27 = 42

============================================================
  RAW PROMPT SENT TO MODEL
============================================================
<|im_system|>tool_declare<|im_middle|>[{"function":{"description":"Perform basic arithmetic operations","name":"calculator","parameters":{"properties":{"a":{"type":"number"},"b":{"type":"number"},"operation":{"enum":["add","subtract","multiply","divide"],"type":"string"}},"required":["operation","a","b"],"type":"object"}},"type":"function"}]<|im_end|><|im_user|>user<|im_middle|>What is 15 + 27?<|im_end|><|im_assistant|>assistant<|im_middle|><think>The user is asking for a simple addition calculation: 15 + 27.
I should use the calculator function with:
- operation: "add"
- a: 15
- b: 27</think><|tool_calls_section_begin|><|tool_call_begin|>functions.calculator:0<|tool_call_argument_begin|>{"operation": "add", "a": 15, "b": 27}<|tool_call_end|><|tool_calls_section_end|><|im_end|><|im_system|>tool<|im_middle|>## Return of None
42<|im_end|><|im_assistant|>assistant<|im_middle|>
============================================================
  RESULT
============================================================
  SUCCESS: reasoning_content IS included in subsequent requests!
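Putting the pieces together, one possible shape for a tool-calling loop is sketched below: every assistant Message object is appended to the history as-is (so its reasoning_content travels with it), and each tool call gets a matching tool message. The execute_tool dispatcher is hypothetical; substitute your own tool execution logic.

def run_agent(user_prompt: str, model: str, tools: list) -> str:
    """Minimal tool-calling loop that keeps reasoning_content in the history."""
    messages = [{"role": "user", "content": user_prompt}]
    while True:
        response = client.chat.completions.create(
            messages=messages,
            model=model,
            tools=tools,
        )
        message = response.choices[0].message
        messages.append(message)  # Message object carries reasoning_content
        if not message.tool_calls:
            return message.content  # no more tool calls; this is the final answer
        for tool_call in message.tool_calls:
            result = execute_tool(tool_call)  # hypothetical dispatcher
            messages.append(
                {"role": "tool", "content": str(result), "tool_call_id": tool_call.id}
            )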
Preserved thinking
While interleaved thinking preserves reasoning within a single turn (across tool calls), preserved thinking extends this concept across multiple user turns. This allows the model to retain reasoning content from previous assistant turns in the conversation context, enabling more coherent multi-turn reasoning.
Controlling reasoning history
You can control how historical reasoning content is included in subsequent requests using the reasoning_history parameter:
completion = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "What is 15 + 27?"},
        assistant_message,  # Contains reasoning_content from previous turn
        {"role": "user", "content": "Now multiply that by 2"},
    ],
    model="accounts/fireworks/models/<reasoning-model>",
    reasoning_history="preserved",  # Retain all previous reasoning content
)
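As a sketch of how this fits into a longer conversation, the loop below keeps the returned Message objects (and therefore their reasoning_content) in the running history and requests preserved reasoning on every turn. The two user prompts are just placeholders.

history = []
for user_input in ["What is 15 + 27?", "Now multiply that by 2"]:
    history.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(
        messages=history,
        model="accounts/fireworks/models/<reasoning-model>",
        reasoning_history="preserved",  # retain reasoning from earlier turns
    )
    message = response.choices[0].message
    history.append(message)  # Message object keeps reasoning_content for later turns
    print("Answer:", message.content)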
See the reasoning_history parameter reference for all accepted values.
When reasoning_history is set to "preserved", the model receives the full reasoning context from all previous turns. This is particularly useful for complex multi-turn conversations where maintaining reasoning continuity is important.
The following example verifies that reasoning_content from all previous turns is included in subsequent requests:
main.py
"""Test that reasoning_content is passed in multi-turn conversations.
This test proves that reasoning_content from previous turns is included
in subsequent requests by examining the raw prompt sent to the model.
"""
from fireworks import Fireworks
from dotenv import load_dotenv
load_dotenv()
client = Fireworks()
MODEL = "accounts/fireworks/models/glm-4p7"
# Define tools to enable interleaved thinking
tools = [
{
"type": "function",
"function": {
"name": "calculator",
"description": "Perform basic arithmetic operations",
"parameters": {
"type": "object",
"properties": {
"operation": {
"type": "string",
"enum": ["add", "subtract", "multiply", "divide"],
},
"a": {"type": "number"},
"b": {"type": "number"},
},
"required": ["operation", "a", "b"],
},
},
}
]
def print_header(title: str, char: str = "β", width: int = 60):
"""Print a formatted section header."""
print(f"\n{char * width}")
print(f" {title}")
print(f"{char * width}")
def print_field(label: str, value: str, indent: int = 2):
"""Print a labeled field with optional indentation."""
prefix = " " * indent
print(f"{prefix}{label}: {value}")
def print_multiline(label: str, content: str, max_preview: int = 200, indent: int = 2):
"""Print multiline content with a label and optional truncation."""
prefix = " " * indent
print(f"{prefix}{label}:")
preview = content[:max_preview] + "..." if len(content) > max_preview else content
for line in preview.split("\n"):
print(f"{prefix} β {line}")
# First turn - get a response with reasoning_content
print_header("FIRST TURN", "β")
first_response = client.chat.completions.create(
messages=[
{
"role": "user",
"content": "What is 15 + 27?",
}
],
model=MODEL,
tools=tools,
)
print_field("π Content", first_response.choices[0].message.content or "(none)")
reasoning = first_response.choices[0].message.reasoning_content
print_multiline("π Reasoning", reasoning)
# Print tool call (verified) from the first response
tool_calls = first_response.choices[0].message.tool_calls
assert tool_calls, "No tool calls in first response!"
print(f"\n π§ Tool Calls ({len(tool_calls)}):")
for i, tc in enumerate(tool_calls, 1):
print(f" [{i}] id={tc.id}")
print(f" function={tc.function.name}")
print(f" arguments={tc.function.arguments}")
tool_call_id = first_response.choices[0].message.tool_calls[0].id
# Verify we got reasoning_content
assert reasoning and len(reasoning) > 0, "No reasoning_content in first response!"
print("\n β First response has reasoning_content")
# Second turn - include the first assistant message
print_header("SECOND TURN", "β")
second_response = client.chat.completions.create(
messages=[
{
"role": "user",
"content": "What is 15 + 27?",
},
first_response.choices[0].message, # Includes reasoning_content
{"role": "tool", "content": "42", "tool_call_id": tool_call_id},
],
model=MODEL,
tools=tools,
raw_output=True,
)
print_field("π Answer", second_response.choices[0].message.content or "(none)")
# Extract and display the raw prompt that was sent to the model
raw_prompt = second_response.choices[0].raw_output.prompt_fragments[0]
print_header("RAW PROMPT SENT TO MODEL", "β")
print(raw_prompt)
# Check if reasoning_content from first turn is in the raw prompt
has_reasoning_content = reasoning[:50] in raw_prompt
print_header("RESULT", "β")
if has_reasoning_content:
print(" β
SUCCESS: reasoning_content IS included in subsequent requests!")
else:
print(" β FAILED: reasoning_content not found in raw prompt")
print()
# Third turn - ask for another calculation
print_header("THIRD TURN", "β")
third_response = client.chat.completions.create(
messages=[
{
"role": "user",
"content": "What is 15 + 27?",
},
first_response.choices[0].message, # Includes reasoning_content
{"role": "tool", "content": "42", "tool_call_id": tool_call_id},
{
"role": "user",
"content": "What is 20 * 5?",
},
],
model=MODEL,
tools=tools,
)
print_field("π Answer", third_response.choices[0].message.content or "(none)")
reasoning_third = third_response.choices[0].message.reasoning_content
print_multiline("π Reasoning", reasoning_third)
# Print tool call from the third response
tool_calls_third = third_response.choices[0].message.tool_calls
assert tool_calls_third, "No tool calls in third response!"
print(f"\n π§ Tool Calls ({len(tool_calls_third)}):")
for i, tc in enumerate(tool_calls_third, 1):
print(f" [{i}] id={tc.id}")
print(f" function={tc.function.name}")
print(f" arguments={tc.function.arguments}")
tool_call_id_third = third_response.choices[0].message.tool_calls[0].id
print()
# Fourth turn - include the third assistant message and tool response
print_header("FOURTH TURN", "β")
fourth_response = client.chat.completions.create(
messages=[
{
"role": "user",
"content": "What is 15 + 27?",
},
first_response.choices[0].message, # Includes reasoning_content
{"role": "tool", "content": "42", "tool_call_id": tool_call_id},
{
"role": "user",
"content": "What is 20 * 5?",
},
third_response.choices[0].message, # Includes reasoning_content from third turn
{"role": "tool", "content": "100", "tool_call_id": tool_call_id_third},
],
model=MODEL,
tools=tools,
raw_output=True,
reasoning_history="preserved",
)
print_field("π Answer", fourth_response.choices[0].message.content or "(none)")
# Extract and display the raw prompt that was sent to the model
raw_prompt_fourth = fourth_response.choices[0].raw_output.prompt_fragments[0]
print_header("RAW PROMPT SENT TO MODEL (FOURTH TURN)", "β")
print(raw_prompt_fourth)
# Check if reasoning_content from both first and third turns are in the raw prompt
has_reasoning_content_first = reasoning[:50] in raw_prompt_fourth
has_reasoning_content_third = reasoning_third[:50] in raw_prompt_fourth
print_header("RESULT (FOURTH TURN)", "β")
if has_reasoning_content_first and has_reasoning_content_third:
print(
" β
SUCCESS: reasoning_content from both first and third turns IS included in fourth turn requests!"
)
elif has_reasoning_content_first:
print(" β οΈ PARTIAL: Only first turn reasoning_content found in raw prompt")
elif has_reasoning_content_third:
print(" β οΈ PARTIAL: Only third turn reasoning_content found in raw prompt")
else:
print(" β FAILED: reasoning_content not found in raw prompt")
print()
GLM-4.7 (output)
============================================================
  FIRST TURN
============================================================
  Content: (none)
  Reasoning:
    | The user is asking for the sum of 15 and 27. This is a simple arithmetic calculation that I can perform using the calculator function. I need to:
    |
    | - Call the calculator function
    | - Set operation to "ad...

  Tool Calls (1):
    [1] id=call_6iGeMOel4nRUNw9B1UZlphUp
        function=calculator
        arguments={"operation": "add", "a": 15, "b": 27}

  First response has reasoning_content

============================================================
  SECOND TURN
============================================================
  Answer: 15 + 27 = 42

============================================================
  RAW PROMPT SENT TO MODEL
============================================================
[gMASK]<sop><|system|>
# Tools
You may call one or more functions to assist with the user query.
You are provided with function signatures within <tools></tools> XML tags:
<tools>
{"type": "function", "function": {"name": "calculator", "description": "Perform basic arithmetic operations", "parameters": {"type": "object", "properties": {"operation": {"type": "string", "enum": ["add", "subtract", "multiply", "divide"]}, "a": {"type": "number"}, "b": {"type": "number"}}, "required": ["operation", "a", "b"]}}}
</tools>
For each function call, output the function name and arguments within the following XML format:
<tool_call>{function-name}<arg_key>{arg-key-1}</arg_key><arg_value>{arg-value-1}</arg_value><arg_key>{arg-key-2}</arg_key><arg_value>{arg-value-2}</arg_value>...</tool_call><|user|>What is 15 + 27?<|assistant|><think>The user is asking for the sum of 15 and 27. This is a simple arithmetic calculation that I can perform using the calculator function. I need to:
- Call the calculator function
- Set operation to "add"
- Set a to 15
- Set b to 27
Let me make the function call.</think><tool_call>calculator<arg_key>operation</arg_key><arg_value>add</arg_value><arg_key>a</arg_key><arg_value>15</arg_value><arg_key>b</arg_key><arg_value>27</arg_value></tool_call><|observation|><tool_response>42</tool_response><|assistant|><think>
============================================================
  RESULT
============================================================
  SUCCESS: reasoning_content IS included in subsequent requests!

============================================================
  THIRD TURN
============================================================
  Answer: (none)
  Reasoning:
    | The user is asking for 20 * 5. I need to use the calculator function with:
    | - operation: "multiply"
    | - a: 20
    | - b: 5

  Tool Calls (1):
    [1] id=call_s3VSYmOARo0gZUl0S4pai8jk
        function=calculator
        arguments={"operation": "multiply", "a": 20, "b": 5}

============================================================
  FOURTH TURN
============================================================
  Answer: 100

============================================================
  RAW PROMPT SENT TO MODEL (FOURTH TURN)
============================================================
[gMASK]<sop><|system|>
# Tools
You may call one or more functions to assist with the user query.
You are provided with function signatures within <tools></tools> XML tags:
<tools>
{"type": "function", "function": {"name": "calculator", "description": "Perform basic arithmetic operations", "parameters": {"type": "object", "properties": {"operation": {"type": "string", "enum": ["add", "subtract", "multiply", "divide"]}, "a": {"type": "number"}, "b": {"type": "number"}}, "required": ["operation", "a", "b"]}}}
</tools>
For each function call, output the function name and arguments within the following XML format:
<tool_call>{function-name}<arg_key>{arg-key-1}</arg_key><arg_value>{arg-value-1}</arg_value><arg_key>{arg-key-2}</arg_key><arg_value>{arg-value-2}</arg_value>...</tool_call><|user|>What is 15 + 27?<|assistant|><think>The user is asking for the sum of 15 and 27. This is a simple arithmetic calculation that I can perform using the calculator function. I need to:
- Call the calculator function
- Set operation to "add"
- Set a to 15
- Set b to 27
Let me make the function call.</think><tool_call>calculator<arg_key>operation</arg_key><arg_value>add</arg_value><arg_key>a</arg_key><arg_value>15</arg_value><arg_key>b</arg_key><arg_value>27</arg_value></tool_call><|observation|><tool_response>42</tool_response><|user|>What is 20 * 5?<|assistant|><think>The user is asking for 20 * 5. I need to use the calculator function with:
- operation: "multiply"
- a: 20
- b: 5</think><tool_call>calculator<arg_key>operation</arg_key><arg_value>multiply</arg_value><arg_key>a</arg_key><arg_value>20</arg_value><arg_key>b</arg_key><arg_value>5</arg_value></tool_call><|observation|><tool_response>100</tool_response><|assistant|><think>
============================================================
  RESULT (FOURTH TURN)
============================================================
  SUCCESS: reasoning_content from both first and third turns IS included in fourth turn requests!