# `reward_function` Decorator Reference
The `@reward_function` decorator transforms a regular Python function into a reward function with standardized inputs/outputs and deployment capabilities.
## Overview
The decorator serves several key purposes:
- Ensures consistent input and output formats
- Adds error handling and validation
- Provides a `.deploy()` method for deploying the function to Fireworks
## Import

```python
from reward_kit import reward_function
```
## Usage

```python
from typing import Dict, List, Optional

from reward_kit import RewardOutput, reward_function

@reward_function
def my_reward_function(
    messages: List[Dict[str, str]],
    original_messages: Optional[List[Dict[str, str]]] = None,
    **kwargs
) -> RewardOutput:
    # Your evaluation logic here
    score = 0.75  # Example score
    return RewardOutput(score=score, metrics={...})
```
## Parameter Requirements

Functions decorated with `@reward_function` should accept the following parameters:
- `messages` (`List[Dict[str, str]]`): Required. List of conversation messages, with the last message typically being the one evaluated (see the sketch below).
- `original_messages` (`Optional[List[Dict[str, str]]]`): Optional. The conversation context, without the message being evaluated.
- `**kwargs`: Optional. Additional parameters (like metadata) that can be passed to the function.
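For illustration, here is a minimal sketch of these two parameters, assuming the common role/content message format used in the examples later in this page:

```python
# Full conversation; the last message is the one being evaluated.
messages = [
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "2 + 2 = 4."},
]

# The same conversation without the message under evaluation.
original_messages = messages[:-1]
```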
## Return Value Requirements

Functions must return a `RewardOutput` object or a compatible tuple format:
```python
# Preferred return format
return RewardOutput(
    score=0.75,  # Overall score
    metrics={
        "clarity": MetricRewardOutput(score=0.8, reason="Good clarity"),
        "accuracy": MetricRewardOutput(score=0.7, reason="Minor errors")
    }
)

# Legacy tuple format (also supported)
return 0.75, {"clarity": 0.8, "accuracy": 0.7}
```
## Added Methods

### `.deploy()`

The decorator adds a `.deploy()` method to the function, allowing it to be deployed to Fireworks.
```python
evaluation_id = my_reward_function.deploy(
    name="my-evaluator",
    description="Evaluates responses based on clarity and accuracy",
    account_id=None,  # Optional, defaults to configured account
    auth_token=None,  # Optional, defaults to configured token
    force=False,      # Set to True to overwrite if it already exists
    providers=None    # Optional model providers configuration
)
```
#### Parameters
- `name` (`str`): Required. ID for the deployed evaluator.
- `description` (`str`): Optional. Human-readable description of the evaluator.
- `account_id` (`Optional[str]`): Optional. Fireworks account ID. If not provided, it is read from config or environment.
- `auth_token` (`Optional[str]`): Optional. Authentication token. If not provided, it is read from config or environment.
- `force` (`bool`): Optional. Whether to overwrite an existing evaluator with the same name. Defaults to `False`.
- `providers` (`Optional[List[Dict[str, str]]]`): Optional. List of provider configurations. If not provided, a default provider is used.
#### Returns

`str`: The evaluation ID, which can be used in RL training.
#### Exceptions

- `ValueError`: Raised if authentication fails or required parameters are missing.
- `requests.exceptions.HTTPError`: Raised if the API request fails.
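A minimal sketch of handling these exceptions around a deployment call (the evaluator name is illustrative):

```python
import requests

try:
    evaluation_id = my_reward_function.deploy(name="my-evaluator")
except ValueError as err:
    # Missing credentials or required parameters
    print(f"Configuration problem: {err}")
except requests.exceptions.HTTPError as err:
    # The Fireworks API rejected the request
    print(f"API request failed: {err}")
```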
## Implementation Details

### Validation Logic
The decorator performs the following validations:
- Ensures the decorated function has the expected parameters
- Validates that the return value is a `RewardOutput` or a compatible tuple
- Handles exceptions that occur during function execution
### Backward Compatibility

For backward compatibility, the decorator supports the legacy tuple return format:

```python
return score, component_scores_dict
```

This is automatically converted to a `RewardOutput` object.
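For example, a function written in the legacy style still works under the decorator; a sketch with illustrative scores:

```python
@reward_function
def legacy_style_reward(messages, original_messages=None, **kwargs):
    # A (score, component_scores) tuple; the decorator converts it
    # into a RewardOutput automatically.
    return 0.75, {"clarity": 0.8, "accuracy": 0.7}
```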
### Deployment Process

When `.deploy()` is called, the decorator:

- Extracts the function's source code
- Creates a wrapper that handles the Fireworks evaluation format
- Creates a temporary directory with the wrapped function
- Uploads and registers the function with the Fireworks API
## Examples

### Basic Usage

```python
from reward_kit import reward_function, RewardOutput, MetricRewardOutput
from typing import List, Dict, Optional

@reward_function
def word_count_reward(
    messages: List[Dict[str, str]],
    original_messages: Optional[List[Dict[str, str]]] = None,
    **kwargs
) -> RewardOutput:
    """Evaluate response based on word count."""
    response = messages[-1].get("content", "")
    word_count = len(response.split())
    score = min(word_count / 100, 1.0)

    return RewardOutput(
        score=score,
        metrics={
            "word_count": MetricRewardOutput(
                score=score,
                reason=f"Word count: {word_count}"
            )
        }
    )
```
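Decorated functions can still be called directly, which is convenient for local testing. A sketch, assuming the decorator preserves the original call signature and that `RewardOutput` exposes the `score` and `metrics` fields shown above:

```python
result = word_count_reward(
    messages=[
        {"role": "user", "content": "Explain photosynthesis."},
        {"role": "assistant", "content": "Photosynthesis converts light into chemical energy."},
    ]
)

print(result.score)                         # 0.06 for this 6-word response
print(result.metrics["word_count"].reason)  # "Word count: 6"
```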
Reward functions can also accept configuration through `metadata`:

```python
from typing import Any

@reward_function
def configurable_reward(
    messages: List[Dict[str, str]],
    original_messages: Optional[List[Dict[str, str]]] = None,
    metadata: Optional[Dict[str, Any]] = None,
    **kwargs
) -> RewardOutput:
    """Reward function that accepts configuration via metadata."""
    metadata = metadata or {}

    # Get threshold from metadata or use default
    threshold = metadata.get("threshold", 50)

    response = messages[-1].get("content", "")
    word_count = len(response.split())
    score = min(word_count / threshold, 1.0)

    return RewardOutput(
        score=score,
        metrics={
            "configured_word_count": MetricRewardOutput(
                score=score,
                reason=f"Word count: {word_count}, threshold: {threshold}"
            )
        }
    )
```
### Deploying a Reward Function

```python
# Define and decorate the reward function
@reward_function
def clarity_reward(messages, original_messages=None, **kwargs):
    # ... evaluation logic ...
    return RewardOutput(score=0.8, metrics={...})

# Deploy the function to Fireworks
evaluation_id = clarity_reward.deploy(
    name="clarity-evaluator",
    description="Evaluates the clarity of responses",
    force=True  # Overwrite if it already exists
)

print(f"Deployed evaluator with ID: {evaluation_id}")
```
### Using with Custom Providers

```python
# Deploy with a specific model provider
evaluation_id = my_reward_function.deploy(
    name="my-evaluator-anthropic",
    description="My evaluator using Claude model",
    force=True,
    providers=[
        {
            "providerType": "anthropic",
            "modelId": "claude-3-sonnet-20240229"
        }
    ]
)
```