Fireworks Agent is a hosted Fireworks assistant that owns the full fine-tuning loop. You describe what you want — “fine-tune a model that classifies our support tickets”, “improve Llama 3.1 70B on our function-calling data”, “train a smaller model that matches GPT-4 on our routing task” — and Agent picks the base model, prepares the dataset, runs a hyperparameter sweep, submits training, evaluates the result, and deploys the fine-tuned model. You stay in the loop for approvals and final calls; everything else is handled. Agent is the easiest of the three Fireworks fine-tuning paths, sitting alongside Managed Fine-Tuning and the Training API. It’s the right starting point when you want a working fine-tuned model without writing config files or Python training loops.Documentation Index
Fetch the complete documentation index at: https://docs.fireworks.ai/llms.txt
Use this file to discover all available pages before exploring further.
Naming. This documentation refers to the product as Fireworks Agent (or just Agent). You may also see it called
pilot in internal source code, in CLI permission presets (--permission-preset=pilot), in the embedded manifest file (pilot.yaml), and in some legacy support contexts — those are all the same product. Use “Fireworks Agent” or “Agent” in your own prompts and communication.What Agent does for you
Picks the right model
Agent recommends a base model and tuning method (SFT, DPO, or classification) from your task description and a peek at your data.
Plans the run
Inspects your dataset, proposes hyperparameters, estimates cost, and presents a single plan for approval before any spend.
Runs and evaluates
Submits the job, streams progress, evaluates checkpoints, and ships a deployed model at the end.
- Run SFT, DPO, and classification jobs from a natural-language prompt
- Inspect your dataset and call out format issues before training starts
- Recommend a base model from a curated panel based on your task shape
- Run a short hyperparameter sweep before committing to full training
- Stream a live progress feed with eval loss, cost-so-far, and ETA
- Evaluate the trained model against a held-out set and surface the best checkpoint
- Deploy the fine-tuned model so you can call it from
chat/completionsimmediately - Author task-specific evaluators for use in SFT sweeps, or Eval Protocol evaluators you can then run through Managed Fine-Tuning’s RFT path
- Answer questions about your account, deployments, jobs, and Fireworks models along the way
Architecture
The runner is an ephemeral, sandboxed environment with its own filesystem. It executes Agent’s plan against your Fireworks account using your API key. Sessions can pause for hours or days waiting on user input without consuming compute.Two ways to use Agent
- In the Fireworks dashboard
- From the CLI or your coding agent
The default — and recommended — surface for most users. Open Agent in the left nav of app.fireworks.ai for a chat interface that streams Agent’s plan, progress, and final report. Best for:
- Most fine-tuning workflows, end to end
- Teams that want a visual plan, cost, and approval UX
- Watching a long training run with a live progress feed
- Skipping
firectlinstallation and service-account setup
Dashboard quickstart
Open Agent
Click Agent in the left navigation at app.fireworks.ai.
Describe the job
A good first prompt is specific about what you’re training for, what data to use, and what success looks like:Agent will inspect the dataset, propose a plan, and stop for your approval.
Approve the plan
Agent presents one structured plan with a cost estimate. Approve, request a change (“use Qwen3 32B instead”, “skip HP tuning”), or cancel. No spend happens before this gate.
How Agent runs a training job
Every Agent session moves through the same seven phases. Coding agents should expect this sequence; humans can use it as a mental model for what to expect next.| # | Phase | What happens |
|---|---|---|
| 1 | Data inspection | Agent reads your dataset, reports format, sample count, token count, and any issues. |
| 2 | Planning & approval | Agent proposes base model, tuning method, hyperparameters, eval path, and a cost estimate. You approve, edit, or cancel. |
| 3 | HP tuning | A short parallel sweep (typically 3 configs) over LoRA rank and learning rate, capped at 6 active jobs by default. |
| 4 | Full training | The best config from phase 3 runs to completion on the full dataset, with per-epoch eval loss. |
| 5 | Evaluation | The trained model is evaluated against a held-out set using one of three strategies you pick in phase 2: validation loss only (default), an evaluator you provide, or an evaluator Agent generates for you. |
| 6 | Deployment | The model is deployed and a fireworks-ai SDK snippet is ready for inference. |
| 7 | Final report | Deployed model ID, key metrics, total cost, and per-phase summary in one message. |
The approval and cost contract
Agent never spends without an explicit approval. This is structural, not a setting. The preview always includes:- Total estimated cost (in USD, with a confidence range)
- Estimated wall time
- Per-phase cost breakdown (HP tuning / full training / evaluation / deployment)
- Cost-so-far in the session (for re-approvals on long runs)
Out-of-coverage behavior
If you ask Agent to use a model or method outside its supported set, it refuses rather than silently approximating. For example, asking for full-parameter tuning on a model with no Agent recipe returns a clear “not supported in Agent — use Managed Fine-Tuning or the Training API” message with a pointer to the right surface. See When not to use Agent.What Agent can do today
Supervised Fine-Tuning
End-to-end SFT with dataset inspection, hyperparameter sweep, evaluator-guided model selection, and a deployed winner.
Preference Learning (DPO/ORPO)
Run DPO or ORPO on pre-paired preferences or generate pairs automatically with delta learning, with an optional base-model sweep.
Classification
Benchmark base models, fine-tune on labeled data, and compare base vs fine-tuned classification accuracy on a held-out split.
Evaluator authoring
Generate a reusable Python evaluator Agent uses to score candidates during an SFT sweep, or an Eval Protocol evaluator you can take to a Managed RFT job — directly from your dataset.
Use with coding agents
Copy-paste skill files for Claude Code, Cursor, Codex, Aider, and Goose so they can drive Agent for you.
Agent vs Managed Fine-Tuning vs Training API
All three sit on the same training infrastructure, GPU shapes, and tuning methods. The difference is how much you drive.| Fireworks Agent | Managed Fine-Tuning | Training API | |
|---|---|---|---|
| Interface | Natural language (dashboard chat, firectl session, or via coding agent) | UI, firectl, REST | Python script |
| Who picks the model | Agent recommends | You | You |
| Who tunes hyperparameters | Agent runs a sweep | You set them | You set them |
| Cost approval | Built-in gate | None — you submit jobs directly | None |
| Custom loss / training loop | Not supported | Not supported | Supported |
| Inference-in-the-loop eval | Not supported | Not supported | Supported (hotload) |
| Best for | Getting a working fine-tuned model fast, without ML expertise | Production runs with known config | Research, custom RL, hybrid losses |
When not to use Agent
Reach for a more direct surface when:- You need a custom loss function or hybrid objective → Training API
- You need to hotload checkpoints for mid-training inference evaluation → Training API
- You already know your config and just want to submit a job → Managed Fine-Tuning
- You need full-parameter tuning on a model Agent doesn’t cover → Managed Fine-Tuning
- You’re training in a fully automated CI pipeline with no human approval → Agent’s approval gate is interactive by design; Managed Fine-Tuning is the better fit today
Security: service accounts and the Agent manifest
When a coding agent drives Fireworks Agent on your behalf, it should authenticate as a service account with thepilot permission preset, not your personal user key. This enforces a layered permissions model:
Effective permissions = User role ∩ Agent capability manifest
The manifest is a real artifact
The Agent capability manifest is a versioned YAML file (pilot.yaml, kept under its original internal name) embedded into the Fireworks control-plane binary at build time. It enumerates the exact set of RPC methods the pilot preset is allowed to call — roughly 80 methods grouped by capability surface:
- Account & billing —
GetAccountUsage,GetQuota,ListQuotas,ListCosts - Models —
GetModel,ListModels,CreateModelVersion,PrepareModel,ValidateModelUpload - Deployments —
GetDeployment,CreateDeployment,DeployModelVersion,GetDeploymentMetrics - Datasets —
CreateDataset,GetDataset,ListDatasets,PreviewDataset,SplitDataset - Evaluators and evaluations —
CreateEvaluator,GetEvaluator,CreateEvaluation,TestEvaluation - Fine-tuning jobs —
CreateSupervisedFineTuningJob,CreateDpoJob,CreateReinforcementFineTuningJob,CreateRlorTrainerJob(the RFT and RLOR-trainer RPCs are granted by the manifest but Agent’s current workflows don’t use them — see What Agent does for you) - Training shapes —
GetTrainingShape,ListTrainingShapes - Batch inference and inference logs —
CreateBatchInferenceJob,ListInferenceLogs
PERMISSION_DENIED at the API gateway, regardless of how the request was constructed.
Non-destructive guarantee, structurally enforced
Agent’s promise to never delete, cancel, or destroy your existing resources is enforced by the manifest itself, not by skill-level politeness. The manifest does not include anyDelete*, Cancel*, or destructive RPC methods. Even a malicious or hallucinated tool call targeting DeleteModel, CancelReinforcementFineTuningJob, or DeleteDeployment is rejected at the control plane before it reaches the resource layer.
Cross-account reads, never cross-account writes
Thepilot preset is granted read-only access across accounts. This is what lets Agent reach Fireworks-owned public resources — base models at accounts/fireworks/models/..., public deployment shapes, public datasets — using only your account’s API key. Agent cannot write into any other account; mutating operations are scoped to your account.
Auto-update on control-plane releases
Because the manifest is compiled into the control-plane binary, expanded Agent capabilities ship automatically with every control-plane deploy. Your service account stores only the preset name (pilot), not the list of allowed methods — so new capabilities are picked up without rotating keys or re-provisioning the service account. See Service Accounts for setup details.
Session lifecycle reference
| Command | What it does | Confirmation required |
|---|---|---|
firectl session create --instruction "<instruction>" | Start a new session | No |
firectl session events <id> --wait | Stream events until terminal or waiting state | No |
firectl session get <id> | Get current status and details | No |
firectl session list | List sessions for your account | No |
firectl session update <id> --instruction "<answer>" | Send a response to a waiting session | Yes — confirm with the user |
firectl session cancel <id> | Stop a running session (keeps the record) | Yes — confirm with the user |
firectl session delete <id> | Remove the session record (irreversible) | Yes — confirm with the user |
--api-key $FIREWORKS_AGENT_API_KEY for non-interactive auth and --scope optimize (the default scope).
Troubleshooting
My job is stuck in pending
My job is stuck in pending
Agent shares the on-demand pool with the Training API. If GPU capacity is tight, jobs queue. If you need guaranteed capacity, request a reservation.
Agent refused my model or method choice
Agent refused my model or method choice
Agent only runs methods it has curated recipes for. For anything outside that set, use Managed Fine-Tuning or the Training API.
My coding agent dumps the event history and exits immediately
My coding agent dumps the event history and exits immediately
You’re missing the
--wait flag. Without it, firectl session events prints existing events and returns. The Fireworks Agent skill always passes --wait, which keeps the stream open until the session reaches waiting, succeeded, failed, or cancelled. If you’re driving firectl directly, add -w / --wait.The cost preview looks higher than I expected
The cost preview looks higher than I expected
Agent’s preview includes HP tuning, full training, evaluation, and the first hour of deployment. Reject the plan and ask Agent to skip HP tuning or use a smaller base model — the next preview will reflect the lower scope.
Next steps
Try Agent in the dashboard
Open Agent in the left nav at app.fireworks.ai.
Drive Agent from a coding agent
Install the skill file in Claude Code, Cursor, Codex, Aider, or Goose.
Managed Fine-Tuning
Drive the same training infra directly when you know your config.
Training API
Write your own Python training loop on Fireworks GPUs.
Agent crib notes
- Auth: set
FIREWORKS_AGENT_API_KEYin a project-local.env(the key is from a service account with thepilotpermission preset). Source it viasource .env && ...and pass on every command as--api-key $FIREWORKS_AGENT_API_KEY. - Use the same session ID for follow-ups. Never create a new session to continue an existing conversation.
- Always pass
--waittosession events, or the command exits immediately after dumping history. create,get,events, andlistare safe to run without user confirmation. Always confirm with the user beforeupdate,cancel, ordelete.- On
waiting, surface Agent’s exact question to the user verbatim; do not paraphrase. - See Use with coding agents for a complete copy-paste skill for Claude Code, Cursor, Codex, Aider, and Goose.