Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.fireworks.ai/llms.txt

Use this file to discover all available pages before exploring further.

Fireworks Agent is a hosted Fireworks assistant that owns the full fine-tuning loop. You describe what you want — “fine-tune a model that classifies our support tickets”, “improve Llama 3.1 70B on our function-calling data”, “train a smaller model that matches GPT-4 on our routing task” — and Agent picks the base model, prepares the dataset, runs a hyperparameter sweep, submits training, evaluates the result, and deploys the fine-tuned model. You stay in the loop for approvals and final calls; everything else is handled. Agent is the easiest of the three Fireworks fine-tuning paths, sitting alongside Managed Fine-Tuning and the Training API. It’s the right starting point when you want a working fine-tuned model without writing config files or Python training loops.
Naming. This documentation refers to the product as Fireworks Agent (or just Agent). You may also see it called pilot in internal source code, in CLI permission presets (--permission-preset=pilot), in the embedded manifest file (pilot.yaml), and in some legacy support contexts — those are all the same product. Use “Fireworks Agent” or “Agent” in your own prompts and communication.

What Agent does for you

Picks the right model

Agent recommends a base model and tuning method (SFT, DPO, or classification) from your task description and a peek at your data.

Plans the run

Inspects your dataset, proposes hyperparameters, estimates cost, and presents a single plan for approval before any spend.

Runs and evaluates

Submits the job, streams progress, evaluates checkpoints, and ships a deployed model at the end.
Concretely, Agent can:
  • Run SFT, DPO, and classification jobs from a natural-language prompt
  • Inspect your dataset and call out format issues before training starts
  • Recommend a base model from a curated panel based on your task shape
  • Run a short hyperparameter sweep before committing to full training
  • Stream a live progress feed with eval loss, cost-so-far, and ETA
  • Evaluate the trained model against a held-out set and surface the best checkpoint
  • Deploy the fine-tuned model so you can call it from chat/completions immediately
  • Author task-specific evaluators for use in SFT sweeps, or Eval Protocol evaluators you can then run through Managed Fine-Tuning’s RFT path
  • Answer questions about your account, deployments, jobs, and Fireworks models along the way
Agent does not run RFT training itself — for that, author the evaluator with Agent and then submit the RFT job through Managed Fine-Tuning. Agent also cannot run an arbitrary Python training loop, use a custom loss function, or sample mid-training from your own evaluator — for those, use the Training API directly.

Architecture

The runner is an ephemeral, sandboxed environment with its own filesystem. It executes Agent’s plan against your Fireworks account using your API key. Sessions can pause for hours or days waiting on user input without consuming compute.

Two ways to use Agent

The default — and recommended — surface for most users. Open Agent in the left nav of app.fireworks.ai for a chat interface that streams Agent’s plan, progress, and final report. Best for:
  • Most fine-tuning workflows, end to end
  • Teams that want a visual plan, cost, and approval UX
  • Watching a long training run with a live progress feed
  • Skipping firectl installation and service-account setup

Dashboard quickstart

1

Open Agent

Click Agent in the left navigation at app.fireworks.ai.
2

Describe the job

A good first prompt is specific about what you’re training for, what data to use, and what success looks like:
Fine-tune a model on accounts/your-account/datasets/support-tickets.
Classify each ticket into one of 12 categories.
Target: better than GPT-4 mini on accuracy. Budget: under $5.
Agent will inspect the dataset, propose a plan, and stop for your approval.
3

Approve the plan

Agent presents one structured plan with a cost estimate. Approve, request a change (“use Qwen3 32B instead”, “skip HP tuning”), or cancel. No spend happens before this gate.
4

Watch it run

Agent streams phase-anchored updates every few minutes through the final report, which includes the deployed model ID and inference endpoint.

How Agent runs a training job

Every Agent session moves through the same seven phases. Coding agents should expect this sequence; humans can use it as a mental model for what to expect next.
#PhaseWhat happens
1Data inspectionAgent reads your dataset, reports format, sample count, token count, and any issues.
2Planning & approvalAgent proposes base model, tuning method, hyperparameters, eval path, and a cost estimate. You approve, edit, or cancel.
3HP tuningA short parallel sweep (typically 3 configs) over LoRA rank and learning rate, capped at 6 active jobs by default.
4Full trainingThe best config from phase 3 runs to completion on the full dataset, with per-epoch eval loss.
5EvaluationThe trained model is evaluated against a held-out set using one of three strategies you pick in phase 2: validation loss only (default), an evaluator you provide, or an evaluator Agent generates for you.
6DeploymentThe model is deployed and a fireworks-ai SDK snippet is ready for inference.
7Final reportDeployed model ID, key metrics, total cost, and per-phase summary in one message.
DPO uses the same shape with phase 3 replaced by a preference sweep (or pair generation followed by a preference sweep when the dataset is prompts-only). Classification uses the same shape with phase 3 expanded into a base-model benchmark plus a fine-tuning sweep, and phase 5 reports per-label and overall accuracy. The promotion gate between phase 3 and phase 4 is one of the two user-facing pauses (the other is plan approval in phase 2).

The approval and cost contract

Agent never spends without an explicit approval. This is structural, not a setting.
At the end of Phase 2 (Planning) — and again before any new spend-incurring step — Agent surfaces a structured cost preview and waits for approval. In the dashboard this is a yes/no prompt. From a coding agent, the skill holds the session in a waiting state, surfaces Agent’s exact question, and only proceeds after you respond via firectl session update. Reject and the session ends with no charges.
The preview always includes:
  • Total estimated cost (in USD, with a confidence range)
  • Estimated wall time
  • Per-phase cost breakdown (HP tuning / full training / evaluation / deployment)
  • Cost-so-far in the session (for re-approvals on long runs)

Out-of-coverage behavior

If you ask Agent to use a model or method outside its supported set, it refuses rather than silently approximating. For example, asking for full-parameter tuning on a model with no Agent recipe returns a clear “not supported in Agent — use Managed Fine-Tuning or the Training API” message with a pointer to the right surface. See When not to use Agent.

What Agent can do today

Supervised Fine-Tuning

End-to-end SFT with dataset inspection, hyperparameter sweep, evaluator-guided model selection, and a deployed winner.

Preference Learning (DPO/ORPO)

Run DPO or ORPO on pre-paired preferences or generate pairs automatically with delta learning, with an optional base-model sweep.

Classification

Benchmark base models, fine-tune on labeled data, and compare base vs fine-tuned classification accuracy on a held-out split.

Evaluator authoring

Generate a reusable Python evaluator Agent uses to score candidates during an SFT sweep, or an Eval Protocol evaluator you can take to a Managed RFT job — directly from your dataset.

Use with coding agents

Copy-paste skill files for Claude Code, Cursor, Codex, Aider, and Goose so they can drive Agent for you.

Agent vs Managed Fine-Tuning vs Training API

All three sit on the same training infrastructure, GPU shapes, and tuning methods. The difference is how much you drive.
Fireworks AgentManaged Fine-TuningTraining API
InterfaceNatural language (dashboard chat, firectl session, or via coding agent)UI, firectl, RESTPython script
Who picks the modelAgent recommendsYouYou
Who tunes hyperparametersAgent runs a sweepYou set themYou set them
Cost approvalBuilt-in gateNone — you submit jobs directlyNone
Custom loss / training loopNot supportedNot supportedSupported
Inference-in-the-loop evalNot supportedNot supportedSupported (hotload)
Best forGetting a working fine-tuned model fast, without ML expertiseProduction runs with known configResearch, custom RL, hybrid losses

When not to use Agent

Reach for a more direct surface when:
  • You need a custom loss function or hybrid objective → Training API
  • You need to hotload checkpoints for mid-training inference evaluation → Training API
  • You already know your config and just want to submit a jobManaged Fine-Tuning
  • You need full-parameter tuning on a model Agent doesn’t cover → Managed Fine-Tuning
  • You’re training in a fully automated CI pipeline with no human approval → Agent’s approval gate is interactive by design; Managed Fine-Tuning is the better fit today

Security: service accounts and the Agent manifest

When a coding agent drives Fireworks Agent on your behalf, it should authenticate as a service account with the pilot permission preset, not your personal user key. This enforces a layered permissions model:
Effective permissions = User role ∩ Agent capability manifest

The manifest is a real artifact

The Agent capability manifest is a versioned YAML file (pilot.yaml, kept under its original internal name) embedded into the Fireworks control-plane binary at build time. It enumerates the exact set of RPC methods the pilot preset is allowed to call — roughly 80 methods grouped by capability surface:
  • Account & billingGetAccountUsage, GetQuota, ListQuotas, ListCosts
  • ModelsGetModel, ListModels, CreateModelVersion, PrepareModel, ValidateModelUpload
  • DeploymentsGetDeployment, CreateDeployment, DeployModelVersion, GetDeploymentMetrics
  • DatasetsCreateDataset, GetDataset, ListDatasets, PreviewDataset, SplitDataset
  • Evaluators and evaluationsCreateEvaluator, GetEvaluator, CreateEvaluation, TestEvaluation
  • Fine-tuning jobsCreateSupervisedFineTuningJob, CreateDpoJob, CreateReinforcementFineTuningJob, CreateRlorTrainerJob (the RFT and RLOR-trainer RPCs are granted by the manifest but Agent’s current workflows don’t use them — see What Agent does for you)
  • Training shapesGetTrainingShape, ListTrainingShapes
  • Batch inference and inference logsCreateBatchInferenceJob, ListInferenceLogs
The control plane enforces the manifest as a hard ceiling before checking the underlying user’s role: even if the user has broader permissions, the preset cannot exceed what the manifest allows. Any RPC outside the manifest returns PERMISSION_DENIED at the API gateway, regardless of how the request was constructed.

Non-destructive guarantee, structurally enforced

Agent’s promise to never delete, cancel, or destroy your existing resources is enforced by the manifest itself, not by skill-level politeness. The manifest does not include any Delete*, Cancel*, or destructive RPC methods. Even a malicious or hallucinated tool call targeting DeleteModel, CancelReinforcementFineTuningJob, or DeleteDeployment is rejected at the control plane before it reaches the resource layer.

Cross-account reads, never cross-account writes

The pilot preset is granted read-only access across accounts. This is what lets Agent reach Fireworks-owned public resources — base models at accounts/fireworks/models/..., public deployment shapes, public datasets — using only your account’s API key. Agent cannot write into any other account; mutating operations are scoped to your account.

Auto-update on control-plane releases

Because the manifest is compiled into the control-plane binary, expanded Agent capabilities ship automatically with every control-plane deploy. Your service account stores only the preset name (pilot), not the list of allowed methods — so new capabilities are picked up without rotating keys or re-provisioning the service account. See Service Accounts for setup details.

Session lifecycle reference

CommandWhat it doesConfirmation required
firectl session create --instruction "<instruction>"Start a new sessionNo
firectl session events <id> --waitStream events until terminal or waiting stateNo
firectl session get <id>Get current status and detailsNo
firectl session listList sessions for your accountNo
firectl session update <id> --instruction "<answer>"Send a response to a waiting sessionYes — confirm with the user
firectl session cancel <id>Stop a running session (keeps the record)Yes — confirm with the user
firectl session delete <id>Remove the session record (irreversible)Yes — confirm with the user
All commands accept --api-key $FIREWORKS_AGENT_API_KEY for non-interactive auth and --scope optimize (the default scope).

Troubleshooting

Agent shares the on-demand pool with the Training API. If GPU capacity is tight, jobs queue. If you need guaranteed capacity, request a reservation.
Agent only runs methods it has curated recipes for. For anything outside that set, use Managed Fine-Tuning or the Training API.
You’re missing the --wait flag. Without it, firectl session events prints existing events and returns. The Fireworks Agent skill always passes --wait, which keeps the stream open until the session reaches waiting, succeeded, failed, or cancelled. If you’re driving firectl directly, add -w / --wait.
Agent’s preview includes HP tuning, full training, evaluation, and the first hour of deployment. Reject the plan and ask Agent to skip HP tuning or use a smaller base model — the next preview will reflect the lower scope.

Next steps

Try Agent in the dashboard

Open Agent in the left nav at app.fireworks.ai.

Drive Agent from a coding agent

Install the skill file in Claude Code, Cursor, Codex, Aider, or Goose.

Managed Fine-Tuning

Drive the same training infra directly when you know your config.

Training API

Write your own Python training loop on Fireworks GPUs.
Agent crib notes
  • Auth: set FIREWORKS_AGENT_API_KEY in a project-local .env (the key is from a service account with the pilot permission preset). Source it via source .env && ... and pass on every command as --api-key $FIREWORKS_AGENT_API_KEY.
  • Use the same session ID for follow-ups. Never create a new session to continue an existing conversation.
  • Always pass --wait to session events, or the command exits immediately after dumping history.
  • create, get, events, and list are safe to run without user confirmation. Always confirm with the user before update, cancel, or delete.
  • On waiting, surface Agent’s exact question to the user verbatim; do not paraphrase.
  • See Use with coding agents for a complete copy-paste skill for Claude Code, Cursor, Codex, Aider, and Goose.