
What this is

RLOR trainer jobs and hotload-enabled deployments hold GPU resources. Always clean up after experiments, especially when a job terminates unexpectedly.

Cleaning up RLOR trainer jobs

import os
from fireworks.training.sdk import TrainerJobManager, DeploymentManager

api_key = os.environ["FIREWORKS_API_KEY"]
account_id = os.environ.get("FIREWORKS_ACCOUNT_ID", "")
base_url = os.environ.get("FIREWORKS_BASE_URL", "https://api.fireworks.ai")

rlor_mgr = TrainerJobManager(api_key=api_key, account_id=account_id, base_url=base_url)
deploy_mgr = DeploymentManager(api_key=api_key, account_id=account_id, base_url=base_url)

# Delete known trainer jobs from this run
for job_id in ["<policy-job-id>", "<reference-job-id>"]:
    rlor_mgr.delete(job_id=job_id)

Cleaning up deployments

deploy_mgr.delete(deployment_id="<deployment-id>")

To keep the deployment resource but release its GPUs (a lighter alternative to deleting it), scale it to zero:

deploy_mgr.scale_to_zero(deployment_id="<deployment-id>")

This sets both minReplicaCount and maxReplicaCount to 0, releasing all accelerators while keeping the deployment available for future scale-up.

Automatic cleanup in training scripts

Use try/finally (or atexit) so that cleanup runs on Ctrl+C and on exceptions:

policy_job_id = "<policy-job-id>"
reference_job_id = "<reference-job-id>"
deployment_id = "research-loop-serving"

try:
    run_training_loop()
finally:
    # Runs on normal completion, on exceptions, and on Ctrl+C (KeyboardInterrupt)
    rlor_mgr.delete(job_id=policy_job_id)
    rlor_mgr.delete(job_id=reference_job_id)
    deploy_mgr.delete(deployment_id=deployment_id)
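
In the finally block above, an exception raised by the first delete call would skip the remaining ones. The sketch below shows one possible way to attempt every cleanup step independently. run_cleanup_steps is a hypothetical helper, not part of the Fireworks SDK, and the usage shown in the trailing comment assumes the rlor_mgr and deploy_mgr objects created earlier.

```python
import logging
from typing import Callable, Iterable, List, Tuple

logger = logging.getLogger("cleanup")


def run_cleanup_steps(
    steps: Iterable[Tuple[str, Callable[[], object]]],
) -> List[Tuple[str, Exception]]:
    """Run every cleanup step, even if earlier ones fail.

    Hypothetical helper (not an SDK function). Returns the
    (label, exception) pairs for the steps that raised.
    """
    failures: List[Tuple[str, Exception]] = []
    for label, step in steps:
        try:
            step()
        except Exception as exc:
            # Keep going so the remaining resources are still released.
            logger.warning("cleanup step %r failed: %s", label, exc)
            failures.append((label, exc))
    return failures


# Possible usage inside the finally block (assumes the managers defined earlier):
#
# finally:
#     run_cleanup_steps([
#         ("policy trainer", lambda: rlor_mgr.delete(job_id=policy_job_id)),
#         ("reference trainer", lambda: rlor_mgr.delete(job_id=reference_job_id)),
#         ("deployment", lambda: deploy_mgr.delete(deployment_id=deployment_id)),
#     ])
```

Returning the failures instead of re-raising lets the caller decide whether to retry or to report the IDs that still need manual cleanup.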

Checking for leaked resources

Track the IDs you create (the trainer job IDs and the deployment ID) and clean those up explicitly. For broad account-wide discovery, use the Fireworks console or the managed fw.*.list() APIs.
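
One way to keep track of created IDs so that they survive a crash is to write them to a small file as each resource is created. The sketch below is only an illustration under stated assumptions: ResourceLedger, the ledger file path, and the "trainer_jobs" / "deployments" keys are hypothetical names and are not part of the Fireworks SDK.

```python
import json
from pathlib import Path
from typing import Dict, List


class ResourceLedger:
    """On-disk record of resource IDs created by a run (hypothetical helper)."""

    def __init__(self, path: Path) -> None:
        self.path = Path(path)

    def _load(self) -> Dict[str, List[str]]:
        # A missing file means nothing has been recorded yet.
        if not self.path.exists():
            return {"trainer_jobs": [], "deployments": []}
        return json.loads(self.path.read_text())

    def _save(self, data: Dict[str, List[str]]) -> None:
        self.path.write_text(json.dumps(data, indent=2))

    def record(self, kind: str, resource_id: str) -> None:
        """Remember a newly created resource; duplicates are ignored."""
        data = self._load()
        ids = data.setdefault(kind, [])
        if resource_id not in ids:
            ids.append(resource_id)
        self._save(data)

    def forget(self, kind: str, resource_id: str) -> None:
        """Drop a resource from the ledger once it has been deleted."""
        data = self._load()
        if resource_id in data.get(kind, []):
            data[kind].remove(resource_id)
        self._save(data)

    def pending(self) -> Dict[str, List[str]]:
        """Return the IDs that were recorded but not yet forgotten."""
        return self._load()
```

After an unexpected termination, a later process could read pending() and pass each remaining ID to rlor_mgr.delete(job_id=...) or deploy_mgr.delete(deployment_id=...), calling forget() after each successful delete.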

Operational guidance

  • Delete both policy and reference trainers when running GRPO (which uses 2 RLOR jobs).
  • Register cleanup with atexit in your training scripts so that it runs automatically on Ctrl+C or on unhandled exceptions.
  • Don’t delete a trainer while a save_weights_for_sampler_ext operation is in progress; wait for it to complete first.
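
The atexit recommendation above could be wired up roughly as follows. This is a minimal sketch, not an SDK feature: register_cleanup and run_cleanup_once are hypothetical helper names, and the usage in the trailing comment assumes the rlor_mgr and deploy_mgr objects created earlier.

```python
import atexit
from typing import Callable, List

_cleanup_callbacks: List[Callable[[], object]] = []
_cleanup_done = False


def register_cleanup(callback: Callable[[], object]) -> None:
    """Queue a zero-argument callable to run at interpreter exit."""
    _cleanup_callbacks.append(callback)


def run_cleanup_once() -> None:
    """Run all queued callbacks in registration order, at most once."""
    global _cleanup_done
    if _cleanup_done:
        return
    _cleanup_done = True
    for callback in _cleanup_callbacks:
        try:
            callback()
        except Exception as exc:
            # One failed delete should not prevent the others from running.
            print(f"cleanup callback failed: {exc!r}")


# Registered once; the guard above makes a manual early call safe as well.
atexit.register(run_cleanup_once)

# Possible usage right after creating resources (assumes the managers defined earlier):
#
# register_cleanup(lambda: rlor_mgr.delete(job_id=policy_job_id))
# register_cleanup(lambda: rlor_mgr.delete(job_id=reference_job_id))
# register_cleanup(lambda: deploy_mgr.delete(deployment_id=deployment_id))
```

Note that atexit handlers run on normal interpreter exit, including after an unhandled exception or KeyboardInterrupt, but not when the process is killed by a signal Python does not handle (for example SIGKILL) or when os._exit() is called. Tracking IDs separately remains useful for those cases.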