Overview
DeploymentSampler handles client-side tokenization via a HuggingFace tokenizer and returns structured SampledCompletion objects with token IDs, logprobs, and completion metadata. Use it in training scripts that need token-level outputs (e.g. GRPO, DPO).
Constructor
| Parameter | Type | Description |
|---|---|---|
inference_url | str | Gateway URL for inference completions |
model | str | Deployment model path (accounts/<id>/deployments/<id>) |
api_key | str | Fireworks API key |
tokenizer | PreTrainedTokenizer | HuggingFace tokenizer matching the base model |
sample_with_tokens(...)
Sample completions and return structured results with token IDs. This method is async, so call it with await or wrap it with asyncio.run(...) from synchronous code:
Retrieving inference logprobs
For GRPO importance sampling, passlogprobs=True:
Sequence length filtering
sample_with_tokens supports max_seq_len for automatic filtering:
- Prompt pre-filter: If the tokenized prompt already meets or exceeds
max_seq_len, the method returns an empty list immediately — no inference call is made. - Completion post-filter: After sampling, any completion whose full token sequence (prompt + completion) exceeds
max_seq_lenis silently dropped.
SampledCompletion
Each completion returned bysample_with_tokens:
| Field | Type | Description |
|---|---|---|
text | str | Decoded completion text |
full_tokens | List[int] | Prompt + completion token IDs |
prompt_len | int | Number of prompt tokens |
finish_reason | str | "stop", "length", etc. |
completion_len | int | Number of completion tokens |
inference_logprobs | List[float] | None | Per-token logprobs (when logprobs=True is passed) |
logprobs_echoed | bool | True when echo=True was used — logprobs are training-aligned (P+C-1 entries) |
routing_matrices | List[str] | None | Base64-encoded per-token routing matrices for MoE Router Replay (R3) |
Related guides
- DeploymentManager — create and manage deployments
- Training and Sampling — end-to-end workflow
- Cookbook RL recipe — GRPO with sampling pipeline