Reward Kit Examples

This page gives an overview of the examples that demonstrate the capabilities of the Reward Kit and links to the documentation for each one. All documentation for these examples is self-contained within the docs/ folder.

Many examples use Hydra for configuration. For execution instructions, see the documentation page for each example.

Available Examples

  • Accuracy Length Example:

  • APPS Coding Example:

    • Illustrates evaluation for coding problems from the APPS dataset.
    • View Documentation (New file to be created)
  • E2B (Code Execution Sandbox) Examples:

    • Covers various E2B integration scenarios for sandboxed code execution.
    • View Documentation
  • GCP Cloud Run Deployment Example:

    • Shows how to deploy a Reward Kit application on GCP Cloud Run.
    • View Documentation (New file to be created)
  • Math Example (GSM8K):

  • Math with Formatting Example:

    • Extends math evaluation to handle specific formatting requirements.
    • View Documentation (New file to be created)
  • Tool Calling Example:

    • Demonstrates evaluation for models that use tools or function calls.
    • View Documentation (New file to be created)
  • TRL Integration Example:

Note: The examples/metrics/ and examples/test_tasks/ directories contain supporting resources. They are not standalone examples and are not documented here.

General Guides for Examples

While the pages above cover specific examples, these general guides might also be useful:

Next Steps