Direct routing is deprecated. Our API gateway is now as fast as direct routing and offers additional benefits. Please follow the migration guide below to switch your requests to the API gateway.
Why migrate?
Direct routing used to be the go-to option for ultra-low latency because it bypassed the global API gateway and hit your deployment directly. We have since made major upgrades to the API gateway so it is now as fast as direct routing.
By migrating to the API gateway you get:
- Multi-region reliability — opt in to multi-region deployments for automatic failover across regions
- Region flexibility — move your deployment to other regions without changing your client code
- Sub-10 ms overhead — negligible added latency for most users across the globe
- Automatic retries — the gateway retries many classes of transient errors for you
- One URL for everything — no need to manage a different URL per deployment
How to migrate
The migration is a two-line change: update the URL and swap your direct-route API key for your standard Fireworks API key.
cURL
# Before migration: request sent over direct routing
curl https://my-account-abcd1234.us-arizona-1.direct.fireworks.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DIRECT_ROUTE_API_KEY" \
  -d '{
    "model": "accounts/<ACCOUNT_ID>/deployments/<DEPLOYMENT_ID>",
    "messages": [
      {
        "role": "user",
        "content": "Explain quantum computing in simple terms"
      }
    ]
  }'
What changed:
- URL: replace https://my-account-abcd1234.us-arizona-1.direct.fireworks.ai/v1 with https://api.fireworks.ai/inference/v1
- API key: replace your direct-route API key with your Fireworks API key
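As a rough illustration of the two changes, the sketch below builds the migrated request with Python's standard library only. The helper name `build_gateway_request` is hypothetical and not part of any Fireworks tooling; the request is constructed but not sent, so the URL and headers can be inspected first.

```python
import json
import urllib.request

# Gateway base URL taken from the "What changed" notes above.
GATEWAY_BASE_URL = "https://api.fireworks.ai/inference/v1"


def build_gateway_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat completions request for the API gateway."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{GATEWAY_BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Standard Fireworks API key, not the direct-route key.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_gateway_request(
    api_key="<YOUR_FIREWORKS_API_KEY>",
    model="accounts/<ACCOUNT_ID>/deployments/<DEPLOYMENT_ID>",
    prompt="Explain quantum computing in simple terms",
)
print(request.full_url)  # https://api.fireworks.ai/inference/v1/chat/completions
```

To send the request, pass it to `urllib.request.urlopen(request)` with a real API key and deployment path in place of the placeholders.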
Python (Fireworks SDK)
# Before migration: client pointed at the direct-route URL
from fireworks import Fireworks

client = Fireworks(
    base_url="https://my-account-abcd1234.us-arizona-1.direct.fireworks.ai",
    api_key="<YOUR_DIRECT_ROUTE_API_KEY>"
)

response = client.chat.completions.create(
    model="accounts/<ACCOUNT_ID>/deployments/<DEPLOYMENT_ID>",
    messages=[{"role": "user", "content": "Explain quantum computing in simple terms"}]
)
print(response.choices[0].message.content)
What changed:
- base_url: remove it; the SDK defaults to api.fireworks.ai
- api_key: remove it; the SDK reads the FIREWORKS_API_KEY environment variable automatically
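With both arguments removed, the key now comes only from the environment, so a missing variable is a likely way for the migrated code to fail. The sketch below shows one possible shape for the migrated call with an explicit pre-flight check. The helper names `missing_api_key` and `ask` are hypothetical, and the SDK behavior relied on here (default endpoint, reading `FIREWORKS_API_KEY`) is taken from the notes above rather than verified independently.

```python
import os

API_KEY_ENV_VAR = "FIREWORKS_API_KEY"


def missing_api_key(env=None) -> bool:
    """Return True when the gateway API key is absent or blank."""
    env = os.environ if env is None else env
    return not env.get(API_KEY_ENV_VAR, "").strip()


def ask(prompt: str, model: str) -> str:
    """Send one chat request through the API gateway."""
    if missing_api_key():
        raise RuntimeError(f"Set {API_KEY_ENV_VAR} before calling the gateway")

    # Imported lazily so the check above can run without the SDK installed.
    from fireworks import Fireworks

    # No base_url and no api_key: per the notes above, the SDK defaults to
    # api.fireworks.ai and reads FIREWORKS_API_KEY from the environment.
    client = Fireworks()
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

A call would then look like `ask("Explain quantum computing in simple terms", "accounts/<ACCOUNT_ID>/deployments/<DEPLOYMENT_ID>")`, with the placeholders replaced by your own account and deployment IDs.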
Python (OpenAI SDK)
# Before migration: client pointed at the direct-route URL
from openai import OpenAI

client = OpenAI(
    api_key="<YOUR_DIRECT_ROUTE_API_KEY>",
    base_url="https://my-account-abcd1234.us-arizona-1.direct.fireworks.ai/v1",
)

response = client.chat.completions.create(
    model="accounts/<ACCOUNT_ID>/deployments/<DEPLOYMENT_ID>",
    messages=[{"role": "user", "content": "Explain quantum computing in simple terms"}]
)
print(response.choices[0].message.content)
What changed:
- base_url: replace the direct-route URL with https://api.fireworks.ai/inference/v1
- api_key: use your Fireworks API key (for example, read from the FIREWORKS_API_KEY environment variable)
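If the base URL lives in configuration rather than in code, a small rewrite step can perform the swap described above. The following is a minimal sketch, assuming every direct-route host ends in `.direct.fireworks.ai` (inferred from the example URL, not a documented rule). `to_gateway_base_url` and `make_client` are hypothetical helper names.

```python
from urllib.parse import urlsplit

GATEWAY_BASE_URL = "https://api.fireworks.ai/inference/v1"
# Assumption: direct-route hosts all share this suffix, as in the example URL.
DIRECT_ROUTE_SUFFIX = ".direct.fireworks.ai"


def to_gateway_base_url(base_url: str) -> str:
    """Map a direct-route base URL to the gateway URL; leave other URLs unchanged."""
    host = urlsplit(base_url).hostname or ""
    if host.endswith(DIRECT_ROUTE_SUFFIX):
        return GATEWAY_BASE_URL
    return base_url


def make_client(configured_base_url: str, api_key: str):
    """Create an OpenAI SDK client that always talks to the API gateway."""
    # Imported lazily so the URL helper works without the SDK installed.
    from openai import OpenAI

    return OpenAI(api_key=api_key, base_url=to_gateway_base_url(configured_base_url))


print(to_gateway_base_url("https://my-account-abcd1234.us-arizona-1.direct.fireworks.ai/v1"))
# https://api.fireworks.ai/inference/v1
```

The helper only recognizes URLs that include a scheme such as `https://`; anything it does not recognize as a direct-route URL is returned as-is, so calling it on an already migrated value is harmless.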
Cloud-specific endpoints
The migrated examples above use api.fireworks.ai, the default endpoint, which gives most users the lowest latency. For advanced use cases, two cloud-specific endpoints are also available:
- api.fireworks.ai: default, lowest latency for most users
- aws.api.fireworks.ai: use if you are on AWS and want traffic to stay on AWS as much as possible
- gcp.api.fireworks.ai: use if you are on GCP and want traffic to stay on GCP as much as possible
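One way to keep the endpoint choice out of application code is a small lookup keyed by cloud. The sketch below is illustrative only: `gateway_base_url` is a hypothetical helper, and it assumes the cloud-specific hosts serve the same `/inference/v1` path as the default endpoint, which this guide does not state explicitly.

```python
# Hostnames from the list above.
GATEWAY_HOSTS = {
    "default": "api.fireworks.ai",
    "aws": "aws.api.fireworks.ai",
    "gcp": "gcp.api.fireworks.ai",
}


def gateway_base_url(cloud: str = "default") -> str:
    """Return the gateway base URL for the given cloud ("default", "aws", or "gcp")."""
    key = cloud.strip().lower()
    if key not in GATEWAY_HOSTS:
        raise ValueError(f"Unknown cloud {cloud!r}; expected one of {sorted(GATEWAY_HOSTS)}")
    # Assumption: the /inference/v1 path is the same on every gateway host.
    return f"https://{GATEWAY_HOSTS[key]}/inference/v1"


print(gateway_base_url("aws"))  # https://aws.api.fireworks.ai/inference/v1
```

The returned string can be passed as `base_url` when constructing a client, for example with the value of a deployment-level setting that names the cloud.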
Private connectivity
Private Service Connect (PSC)
Contact your Fireworks representative to set up GCP Private Service Connect to your deployment.
AWS PrivateLink
Contact your Fireworks representative to set up AWS PrivateLink to your deployment.