Prepare dataset
input
field containing a messages
array, where each message is an object with two fields:
role
: one of system
, user
, or assistant
content
: a string representing the message contentpreferred_output
field containing an assistant message with an ideal responsenon_preferred_output
field containing an assistant message with a suboptimal responseeinstein_dpo.jsonl
.Create and upload the dataset
firectl
, Restful API
, builder SDK
or UI
.Create Dataset
and follow the wizard.
UI
is more suitable for smaller datasets < 500MB
while firectl
might work better for bigger datasets.Ensure the dataset ID conforms to the resource id restrictions.Create a DPO Job
firectl
to create a new DPO job:Monitor the DPO Job
firectl
to monitor progress updates for the DPO fine-tuning job.STATE
will be set to JOB_STATE_COMPLETED
, and the fine-tuned model can be deployed.Deploy the DPO fine-tuned model
Python builder SDK
references
Restful API
references
firectl
references