The Llama 3.1 405B model uses the FP8 quantization format, which reduces the model's memory footprint and improves inference throughput compared to 16-bit precision, with minimal impact on output quality.

Note: BF16 precision will be available soon for on-demand deployments.
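As background, FP8 typically refers to the 8-bit E4M3 floating-point layout (1 sign bit, 4 exponent bits, 3 mantissa bits, maximum normal value 448), which is commonly used for quantized weights. The sketch below, a simplified illustration and not the serving stack's actual implementation, rounds a Python float to its nearest E4M3-representable value, ignoring NaN edge cases:

```python
import math

def quantize_e4m3(x: float) -> float:
    """Round x to the nearest FP8 E4M3 value (illustrative sketch only)."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    mag = min(mag, 448.0)              # saturate at the format's largest normal value
    exp = math.floor(math.log2(mag))   # power-of-two bucket the value falls in
    exp = max(exp, -6)                 # clamp below the smallest normal exponent
    step = 2.0 ** (exp - 3)            # spacing of representable values: 3 mantissa bits
    return sign * round(mag / step) * step

# 0.3 is not representable in E4M3; it rounds to the nearest grid point, 0.3125.
```

Because only 3 mantissa bits survive, nearby inputs collapse onto a coarse grid; the quality loss from FP8 comes from this rounding, while the 2x size reduction versus BF16 comes from storing 8 bits per weight instead of 16.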