JSON Schema Validation
This guide demonstrates how to validate JSON outputs from LLM responses against a defined schema.
Overview
The JSON Schema reward functions allow you to:
- Extract JSON data from LLM responses
- Validate it against a predefined JSON Schema
- Get detailed validation metrics and error reports
- Score models based on schema adherence
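The four steps above can be pictured with a small standard-library sketch. This is not Reward Kit's implementation: `extract_json`, `check_against_schema`, and `score_response` are hypothetical helpers, the checker understands only the `type`, `required`, and `properties` keywords, and the all-or-nothing score is an assumption made for illustration.

```python
import json
import re

# Map JSON Schema type names to Python types (simplified stand-in).
_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def extract_json(text):
    """Parse the span from the first '{' to the last '}' as JSON, or return None."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None


def check_against_schema(data, schema, path="$"):
    """Collect error strings for a tiny subset of JSON Schema keywords."""
    errors = []
    expected = schema.get("type")
    if expected in _TYPES:
        ok = isinstance(data, _TYPES[expected])
        # bool is a subclass of int in Python; JSON Schema keeps them apart.
        if isinstance(data, bool) and expected in ("integer", "number"):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]
    if isinstance(data, dict):
        for key in schema.get("required", []):
            if key not in data:
                errors.append(f"{path}: missing required field '{key}'")
        for key, sub in schema.get("properties", {}).items():
            if key in data:
                errors.extend(check_against_schema(data[key], sub, f"{path}.{key}"))
    return errors


def score_response(text, schema):
    """Return (score, errors): 1.0 when the extracted JSON passes, else 0.0."""
    data = extract_json(text)
    if data is None:
        return 0.0, ["no parseable JSON found"]
    errors = check_against_schema(data, schema)
    return (1.0 if not errors else 0.0), errors
```

The error list plays the role of the detailed report, and the numeric score is what a model would be ranked on. A full validator covers many more keywords than this stand-in does.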
Prerequisites
Before using the JSON Schema validation rewards, ensure you have:
- Python 3.8+ installed on your system
- Reward Kit installed:
pip install reward-kit
- jsonschema package (installed automatically with Reward Kit)
Basic Usage
Here’s a simple example of how to use the JSON Schema validation:
This includes all the requested fields with sample data.""" } ]
# Validate the JSON against the schema
result = json_schema_reward(
    messages=messages,
    schema=person_schema
)

# Print the results
print(f"Overall score: {result.score}")
print("Metrics:")
for name, metric in result.metrics.items():
    print(f"  {name}: {metric.score}")
    print(f"    {metric.reason}")
The data has been formatted according to the requirements."""} ]
# Use custom extractor
result = json_schema_reward(
    messages=messages,
    schema=person_schema,
    json_extractor=my_json_extractor
)
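As a rough idea of what a custom extractor might look like, the sketch below pulls JSON out of fenced code blocks in the response text. The contract is an assumption, not something this guide confirms: the function is taken to receive the response text as a string and to return the parsed object, or `None` when nothing parses. Check the Reward Kit reference for the actual signature expected by `json_extractor`.

```python
import json
import re

# Three backticks, built programmatically to keep this example easy to embed.
_FENCE_MARK = "`" * 3

# Captures the body of each fenced block, whatever its language tag.
_FENCE = re.compile(
    _FENCE_MARK + r"[A-Za-z0-9_+-]*[ \t]*\n(.*?)" + _FENCE_MARK,
    re.DOTALL,
)


def my_json_extractor(text):
    """Return the first fenced block that parses as JSON, else None.

    Assumed contract: takes the response text, returns a parsed object.
    """
    for body in _FENCE.findall(text):
        try:
            return json.loads(body)
        except json.JSONDecodeError:
            continue  # Not JSON (for example, a Python snippet); keep looking.
    return None
```

Skipping blocks that fail to parse lets the extractor ignore unrelated code snippets that appear before the JSON payload.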
Pet:
Both objects follow the standard format."""} ]
# Validate only the first JSON object
result = json_schema_reward(
    messages=messages,
    schema=person_schema,
    json_index=0  # 0-based index for which JSON object to validate
)
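To make the 0-based indexing concrete, here is a minimal sketch of how several JSON objects in one response could be located and selected. It uses only the standard library; `find_json_objects` and `pick_json` are hypothetical names, and Reward Kit's own extraction logic may differ.

```python
import json


def find_json_objects(text):
    """Return every top-level JSON object found in text, in order."""
    decoder = json.JSONDecoder()
    found = []
    pos = text.find("{")
    while pos != -1:
        try:
            obj, end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            pos = text.find("{", pos + 1)
            continue
        found.append(obj)
        # Resume after the decoded object so nested braces are not re-scanned.
        pos = text.find("{", end)
    return found


def pick_json(text, json_index=0):
    """Return the object at json_index (0-based), or None if out of range."""
    objects = find_json_objects(text)
    if 0 <= json_index < len(objects):
        return objects[json_index]
    return None
```

With a response that contains a person object followed by a pet object, `pick_json(text, 0)` would select the person and `pick_json(text, 1)` the pet.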
Multiple Schema Requirements
For more complex requirements, you can specify an array of valid schemas:
This follows the admin schema with the required permissions."""} ]
# Validate against multiple schemas (passes if valid against any schema)
result = json_schema_reward(
    messages=messages,
    schema=schemas,
    require_all_valid=False  # Only need to be valid against one schema
)
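The difference between the two settings of `require_all_valid` comes down to "any" versus "all". The sketch below shows that combination step in isolation. `combine_schema_results` and `has_required` are hypothetical helpers, the two schemas are illustrative, and treating an empty schema list as a failure is an assumption.

```python
def has_required(data, schema):
    """Simplified check: every name in schema['required'] is present in data."""
    return all(key in data for key in schema.get("required", []))


def combine_schema_results(valid_flags, require_all_valid=False):
    """Combine per-schema pass/fail flags into one verdict."""
    flags = list(valid_flags)
    if not flags:
        return False  # No schemas supplied: treated as a failure in this sketch.
    return all(flags) if require_all_valid else any(flags)


# Illustrative schemas: a plain user and an admin with permissions.
user_schema = {"required": ["name", "email"]}
admin_schema = {"required": ["name", "permissions"]}

data = {"name": "Ada", "permissions": ["read", "write"]}
flags = [has_required(data, s) for s in (user_schema, admin_schema)]

passes_any = combine_schema_results(flags, require_all_valid=False)
passes_all = combine_schema_results(flags, require_all_valid=True)
```

Here the data satisfies only the admin schema, so it passes under the "any" rule and fails under the "all" rule.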
Best Practices
- Clear Schemas: Define schemas with precise types and constraints
- Required Fields: Explicitly specify which fields are required
- Helpful Error Messages: Include clear descriptions in the schema for better error messages
- Nested Validation: Use nested schemas for complex data structures
- Alternative Schemas: Consider anyOf/oneOf when more than one structure is acceptable
- Test with Examples: Validate the schema against known-good and known-bad examples
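The practices above can be combined in a single schema. The following sketch is illustrative only: every field name is hypothetical, and `undeclared_required` is a small helper written for this example that flags `required` names missing from `properties`, which is often a sign of a typo.

```python
import json

# Hypothetical schema; every field name here is illustrative.
example_person_schema = {
    "type": "object",
    "description": "A person record returned by the model",
    "properties": {
        "name": {"type": "string", "minLength": 1, "description": "Full name"},
        "age": {"type": "integer", "minimum": 0, "description": "Age in whole years"},
        # Nested validation: the address is its own object schema.
        "address": {
            "type": "object",
            "description": "Postal address",
            "properties": {
                "city": {"type": "string"},
                "country": {"type": "string"},
            },
            "required": ["city"],
        },
        # Alternative schemas: contact is an email string or a phone object.
        "contact": {
            "description": "How to reach the person",
            "anyOf": [
                {"type": "string", "format": "email"},
                {
                    "type": "object",
                    "properties": {"phone": {"type": "string"}},
                    "required": ["phone"],
                },
            ],
        },
    },
    "required": ["name", "age"],
}

# Known-good and known-bad examples to keep alongside the schema.
good_example = {"name": "Ada", "age": 36, "address": {"city": "London"}}
bad_example = {"name": "", "age": -1, "address": {"country": "UK"}}


def undeclared_required(schema):
    """Return required names that have no entry under properties."""
    props = schema.get("properties", {})
    return [name for name in schema.get("required", []) if name not in props]


# The schema should survive a JSON round trip unchanged.
round_trip = json.loads(json.dumps(example_person_schema))
```

The good and bad examples are meant to be run through whichever validator you use, so that any change to the schema is checked against both.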
Limitations
- Evaluates only the structure of the output, not the quality or usefulness of its content
- Requires well-formed JSON; a response whose JSON cannot be parsed cannot be validated
- Some aspects of data quality, such as whether values are reasonable, may require custom checks
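A custom check of the kind mentioned in the last point might look like the sketch below, which runs after structural validation has passed. The function name, the field names, and the 0 to 130 age range are all assumptions chosen for illustration.

```python
def plausibility_issues(person):
    """Return reasons the values look unreasonable, even if well-typed."""
    issues = []

    name = person.get("name")
    if isinstance(name, str) and not name.strip():
        issues.append("name is blank")

    age = person.get("age")
    # Exclude bool explicitly, since bool is a subclass of int in Python.
    if isinstance(age, int) and not isinstance(age, bool):
        if not 0 <= age <= 130:
            issues.append(f"age {age} is outside the range 0-130")

    return issues
```

An empty list means no plausibility problems were found. The result can be combined with the schema score in whatever way suits the evaluation.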
Next Steps
- Learn about Function Calling Evaluation for validating function calls
- Explore Code Execution Evaluation for evaluating code solutions
- See Creating Custom Reward Functions to build custom validation logic