What this is
RLOR trainer jobs and weight-sync-enabled deployments hold GPU resources. Always clean up after experiments, especially if jobs terminate unexpectedly. In new SDK and cookbook code, cleanup is owned by the SDK-managed service client.
Automatic cleanup via the SDK-managed service
Create the service with cleanup options, then close it in a finally block.
cleanup_trainer_on_close=True deletes SDK-managed trainers. Separate reference trainers are governed by cleanup_reference_trainer_on_close (default True). cleanup_deployment_on_close="scale_to_zero" releases deployment GPUs while keeping the deployment resource around; use "delete" only when you want to remove the deployment entirely.
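The create-then-close-in-finally pattern can be sketched as follows. This is a minimal illustration, not the SDK's implementation: `StubService` and `run_with_service` are hypothetical names, and the stub only records the cleanup options named above so the control flow can be run without the SDK installed.

```python
from typing import Callable, Protocol


class Closeable(Protocol):
    def close(self) -> None: ...


class StubService:
    """Hypothetical stand-in for the SDK-managed service client."""

    def __init__(self, **cleanup_options: object) -> None:
        self.cleanup_options = cleanup_options
        self.closed = False

    def close(self) -> None:
        # The real client would apply its cleanup options here.
        self.closed = True


def run_with_service(
    make_service: Callable[[], Closeable],
    experiment: Callable[[Closeable], object],
) -> object:
    """Create the service, run the experiment, and always close the service."""
    # Created outside the try block: if creation fails there is nothing to close.
    service = make_service()
    try:
        return experiment(service)
    finally:
        # Runs on normal return, on exceptions, and on KeyboardInterrupt (Ctrl+C).
        service.close()


if __name__ == "__main__":
    stub = StubService(
        cleanup_trainer_on_close=True,
        cleanup_reference_trainer_on_close=True,
        cleanup_deployment_on_close="scale_to_zero",
    )
    run_with_service(lambda: stub, lambda svc: "finished")
    print("closed:", stub.closed)
```

With the real client, `make_service` would be whatever call constructs the SDK-managed service with these options.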
Cookbook recipes use the same service-client lifecycle internally and close the service through an ExitStack.
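For readers unfamiliar with `contextlib.ExitStack`, here is a rough sketch of how a recipe could register the service's `close` with a stack. `StubService` and `run_recipe` are hypothetical stand-ins; the actual cookbook code may be structured differently.

```python
from contextlib import ExitStack
from typing import List


class StubService:
    """Hypothetical stand-in for the SDK-managed service client."""

    def __init__(self, events: List[str]) -> None:
        self.events = events

    def close(self) -> None:
        self.events.append("close")


def run_recipe(events: List[str], fail: bool = False) -> None:
    with ExitStack() as stack:
        service = StubService(events)
        # Register close() right after creation so it runs however the block exits.
        stack.callback(service.close)
        events.append("train")
        if fail:
            raise RuntimeError("simulated training failure")
        events.append("done")
```

Callbacks registered on an `ExitStack` run in reverse registration order when the `with` block exits, so resources created later are released first.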
The standalone ResourceCleanup context manager and setup_infra helper have been removed from the cookbook. Provisioning and teardown now live behind the SDK-managed service client. See Migrating from the deprecated managed infra.
Trainer inactivity cleanup
Long-running RLOR trainer jobs are automatically stopped after 60 minutes with no tracked activity. The trainer reports this activity to the control plane, and tracked activity includes trainer API operations and active-session heartbeats. When creating a trainer through the REST API (POST /v1/accounts/{account_id}/rlorTrainerJobs), set inactivityTimeout to a positive protobuf JSON duration to choose a different timeout.
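A protobuf JSON duration is a string of seconds with an `s` suffix, for example `"1800s"` for 30 minutes. The sketch below formats a `timedelta` that way and builds, without sending, a create request. The base URL, the bearer-token header, and the placement of `inactivityTimeout` at the top level of the request body are assumptions for illustration; `build_create_request` is a hypothetical helper.

```python
import json
import urllib.request
from datetime import timedelta

API_BASE = "https://api.fireworks.ai"  # assumed base URL


def to_protobuf_duration(delta: timedelta) -> str:
    """Format a timedelta as a protobuf JSON Duration string, e.g. '1800s'."""
    total_us = delta // timedelta(microseconds=1)
    if total_us <= 0:
        raise ValueError("inactivityTimeout must be a positive duration")
    seconds, micros = divmod(total_us, 1_000_000)
    if micros:
        return f"{seconds}.{micros:06d}".rstrip("0") + "s"
    return f"{seconds}s"


def build_create_request(
    account_id: str, api_key: str, body: dict, timeout: timedelta
) -> urllib.request.Request:
    """Build the POST request; the caller decides when to send it."""
    # Assumption: inactivityTimeout sits at the top level of the JSON body.
    payload = dict(body, inactivityTimeout=to_protobuf_duration(timeout))
    return urllib.request.Request(
        url=f"{API_BASE}/v1/accounts/{account_id}/rlorTrainerJobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The request is only constructed here; passing it to `urllib.request.urlopen` would send it.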
In the Training SDK, set TrainerJobConfig.inactivity_timeout and pass the config to TrainerJobManager.create(...) or TrainerJobManager.create_and_wait(...).
With firectl, use --inactivity-timeout 30m or --inactivity-timeout 2h. When the value is omitted or set to 0, Fireworks uses the 60-minute default.
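The defaulting rule can be modeled in a few lines. This is a simplified sketch of the documented behavior, not firectl's parser: `effective_timeout` is a hypothetical helper that accepts only single-unit values such as `30m` or `2h`, and it does not model the disable option.

```python
import re
from datetime import timedelta
from typing import Optional

DEFAULT_INACTIVITY_TIMEOUT = timedelta(minutes=60)
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}


def effective_timeout(flag_value: Optional[str]) -> timedelta:
    """Return the timeout that applies for a given flag value."""
    # Omitted or 0 means "use the 60-minute default".
    if flag_value is None or flag_value.strip() in ("", "0"):
        return DEFAULT_INACTIVITY_TIMEOUT
    match = re.fullmatch(r"(\d+)([smh])", flag_value.strip())
    if match is None:
        raise ValueError(f"unsupported duration in this sketch: {flag_value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    delta = timedelta(**{_UNITS[unit]: amount})
    # A zero-length value such as "0m" also falls back to the default.
    return delta if delta > timedelta(0) else DEFAULT_INACTIVITY_TIMEOUT
```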
To disable automatic inactivity cleanup, set disableInactivityCleanup in the REST API, set TrainerJobConfig.disable_inactivity_cleanup=True in the Training SDK, or pass --disable-inactivity-cleanup in firectl. The trainer will not be stopped due to inactivity, and GPU usage continues to accrue while the trainer is running, so delete the trainer when you no longer need it.
Manual compatibility cleanup
If you provisioned resources yourself with TrainerJobManager / DeploymentManager instead of the managed service, delete them directly.
Cleaning up RLOR trainer jobs
Cleaning up deployments
Scaling to zero sets minReplicaCount and maxReplicaCount to 0, releasing all accelerators while keeping the deployment available for future scale-up.
Manual cleanup with try/finally
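One way to structure manual cleanup is to run every teardown step in `finally` and keep going when a step fails, so that one failed deletion does not leave the remaining resources allocated. The sketch below is generic: `run_cleanup` and `run_with_manual_cleanup` are hypothetical helpers, and each step is a callable that you would bind to your own delete or scale-to-zero call.

```python
from typing import Callable, List, Tuple

CleanupStep = Tuple[str, Callable[[], None]]


def run_cleanup(steps: List[CleanupStep]) -> List[str]:
    """Run every cleanup step and return the labels of the steps that failed."""
    failed: List[str] = []
    for label, step in steps:
        try:
            step()
        except Exception as exc:  # keep going so later resources are still released
            print(f"cleanup failed for {label}: {exc}")
            failed.append(label)
    return failed


def run_with_manual_cleanup(train: Callable[[], None], steps: List[CleanupStep]) -> None:
    try:
        train()
    finally:
        # Runs on success, exceptions, and Ctrl+C.
        failed = run_cleanup(steps)
        if failed:
            print("still allocated, delete manually:", ", ".join(failed))
```

For a GRPO run, `steps` would typically hold one entry for the policy trainer, one for the reference trainer, and one for the deployment.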
Checking for leaked resources
Track the IDs you create (trainer job IDs + deployment ID) and clean those explicitly. For broad account-wide discovery, use the Fireworks console or the managed fw.*.list() APIs.
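A simple way to track IDs is to write them to disk as soon as each resource is created, so that a crashed process still leaves a record of what to delete. `ResourceLedger` below is a hypothetical helper built only on the standard library; it is not part of the SDK.

```python
import json
from pathlib import Path
from typing import Dict, List


class ResourceLedger:
    """Persist the IDs of created resources to a JSON file."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.entries: Dict[str, List[str]] = {"trainer_jobs": [], "deployments": []}
        if path.exists():
            # Reload IDs left behind by an earlier run that did not clean up.
            self.entries.update(json.loads(path.read_text()))

    def record(self, kind: str, resource_id: str) -> None:
        self.entries.setdefault(kind, []).append(resource_id)
        self._flush()

    def forget(self, kind: str, resource_id: str) -> None:
        # Call this only after the resource has actually been deleted.
        if resource_id in self.entries.get(kind, []):
            self.entries[kind].remove(resource_id)
            self._flush()

    def outstanding(self) -> Dict[str, List[str]]:
        return {kind: list(ids) for kind, ids in self.entries.items() if ids}

    def _flush(self) -> None:
        self.path.write_text(json.dumps(self.entries, indent=2))
```

Anything still listed by `outstanding()` after a run is a candidate leak to check in the console.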
Operational guidance
- Delete both policy and reference trainers when running GRPO (which uses 2 RLOR jobs).
- Close the managed service in finally so trainer/reference/deployment cleanup runs on Ctrl+C or exceptions.
- Don’t delete a trainer while a save_weights_for_sampler operation is in progress; wait for it to complete first.
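The last point can be enforced in-process with a small guard that counts in-flight operations and blocks deletion until the count reaches zero. `InFlightGuard` is a hypothetical helper; it only sees operations started in the same process and does not query the trainer itself.

```python
import threading
from contextlib import contextmanager
from typing import Iterator, Optional


class InFlightGuard:
    """Track in-flight operations so deletion can wait for them to finish."""

    def __init__(self) -> None:
        self._count = 0
        self._cond = threading.Condition()

    @contextmanager
    def operation(self) -> Iterator[None]:
        with self._cond:
            self._count += 1
        try:
            yield
        finally:
            with self._cond:
                self._count -= 1
                self._cond.notify_all()

    def wait_idle(self, timeout: Optional[float] = None) -> bool:
        """Return True once nothing is in flight, or False if the timeout expires."""
        with self._cond:
            return self._cond.wait_for(lambda: self._count == 0, timeout=timeout)
```

In use, the save_weights_for_sampler call would be wrapped in `with guard.operation():`, and the cleanup path would call `guard.wait_idle()` before deleting the trainer.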