Evaluation workflows
Evaluation Workflows
This guide explains the complete lifecycle of a reward function, from local development and testing to deployment on the Fireworks platform.
Development Workflow Overview
The typical workflow for developing and deploying reward functions involves:
- Local Development: Writing and testing reward functions locally
- Preview Evaluation: Testing with sample data to validate performance
- Deployment: Making the reward function available for training workflows
- Integration: Using the deployed evaluator in RLHF training
1. Local Development
Creating a Reward Function
Start by creating a reward function with the @reward_function
decorator:
Local Testing
Test your reward function with sample messages:
Creating a Test File
For more comprehensive testing, create a separate test script:
2. Preview Evaluation
Creating Sample Data
Create a JSONL file with sample conversations for evaluation:
Using the CLI for Preview
Use the Reward Kit CLI to preview your evaluation:
Programmatic Preview
Alternatively, use the API for programmatic preview:
3. Deployment
Direct Deployment from Function
You can deploy the reward function directly:
Using the CLI for Deployment
Or use the CLI to deploy the function:
Custom Provider Deployment
Deploy with a specific model provider:
Using create_evaluation Function
You can also use the create_evaluation
function directly:
4. Integration with Training
Using in an RL Training Job
Once deployed, use the evaluator in an RL training job:
Programmatic Integration with TRL
For programmatic integration with the Transformer Reinforcement Learning (TRL) library:
Best Practices
- Iterative Development: Start simple, test thoroughly, and refine your reward function
- Version Control: Use version control for your reward functions and track changes
- Sample Diversity: Test with a diverse set of samples to ensure robustness
- Documentation: Document the behavior and assumptions of your reward function
- Error Handling: Include robust error handling to prevent evaluation failures
- Logging: Add detailed logging for debugging and monitoring
Next Steps
Now that you understand the complete workflow:
- Try creating a Basic Reward Function
- Explore Advanced Reward Functions with multiple metrics
- Learn about Best Practices for designing effective reward functions