What model precisions are available and how do I check them?
You can check the available precision types for any model using firectl. Models may support different numerical precisions like FP16, FP8, BF16, or INT8, which affect memory usage and inference speed.
# Check supported precisions for a model
firectl get model accounts/fireworks/models/llama-v3p1-8b-instruct | grep -E "(Supported Precisions|Supported Precisions With Calibration)"