Reward Kit Examples

This page gives an overview of the Reward Kit examples and links to the documentation for each one. All documentation for these examples is self-contained within the docs/ folder. Many examples use Hydra for configuration; refer to each example's documentation page for execution instructions.

Available Examples

  • Accuracy Length Example:
  • APPS Coding Example:
    • Illustrates evaluation for coding problems from the APPS dataset.
    • View Documentation (New file to be created)
  • E2B (Code Execution Sandbox) Examples:
    • Covers various E2B integration scenarios for sandboxed code execution.
    • View Documentation
  • GCP Cloud Run Deployment Example:
    • Shows how to deploy a Reward Kit application on GCP Cloud Run.
    • View Documentation (New file to be created)
  • Math Example (GSM8K):
  • Math with Formatting Example:
    • Extends math evaluation to handle specific formatting requirements.
    • View Documentation (New file to be created)
  • Tool Calling Example:
    • Demonstrates evaluation for models that use tools or function calls.
    • View Documentation (New file to be created)
  • TRL Integration Example:

Note: The examples/metrics/ and examples/test_tasks/ directories in the root examples/ folder contain supporting resources; they are not documented here as standalone examples.

General Guides for Examples

While the pages above cover specific examples, these general guides might also be useful:

Next Steps
