Reward function anatomy
Reward Function Anatomy
This guide provides a detailed explanation of how reward functions are structured in the Reward Kit, focusing on the @reward_function
decorator and the components that make up a complete reward function.
The @reward_function
Decorator
The @reward_function
decorator is the core mechanism that transforms a regular Python function into a reward function that can be used for evaluation and deployment.
What the Decorator Does
The @reward_function
decorator performs several important functions:
- Input Validation: Ensures the function receives the expected parameters
- Output Standardization: Ensures the function returns a properly formatted
RewardOutput
object - Deployment Capability: Adds a
.deploy()
method to the function for easy deployment - Backward Compatibility: Handles legacy return formats (tuples of score and metrics)
Under the Hood
Internally, the decorator wraps your function with logic that:
- Processes the input parameters
- Calls your function with the standardized inputs
- Handles any exceptions that occur during execution
- Formats the output as a
RewardOutput
object - Provides deployment capabilities through the
.deploy()
method
Function Parameters
A standard reward function has these parameters:
Required Parameters
messages
: A list of message dictionaries in the conversation, where each message has at least"role"
and"content"
keys. The last message is typically the one being evaluated.
Optional Parameters
original_messages
: The conversation context, usually messages before the response being evaluated. If not provided, it defaults tomessages[:-1]
.**kwargs
: Additional parameters that can be used to customize the evaluation.
Return Value
A reward function must return a RewardOutput
object:
RewardOutput Structure
score
: The final aggregate score (typically between 0.0 and 1.0)metrics
: A dictionary of component metrics, each with its own score and explanation
Multi-Component Reward Functions
Complex reward functions often evaluate multiple aspects of a response:
Deployment Capabilities
The @reward_function
decorator adds a .deploy()
method to your function:
Deploy Method Parameters
name
: ID for the deployed evaluator (required)description
: Human-readable description (optional)force
: Whether to overwrite an existing evaluator with the same name (optional)providers
: List of model providers to use for evaluation (optional)
Error Handling
Robust reward functions include proper error handling:
Working with Metadata
You can pass additional configuration through the **kwargs
parameter:
When calling the function, you can pass this metadata:
Next Steps
Now that you understand the structure of reward functions:
- Learn about the Core Data Types used in reward functions
- Explore Evaluation Workflows for testing and deployment
- See Code Examples for practical implementations