- --draft-token-count (when creating models): especially effective at low batch sizes
- --draft-model: test Eagle models or contact the FireOptimizer team for custom speculators
- --accelerator-type: specify hardware when creating deployments
- --precision FP8 (during deployment creation): reduce computation overhead
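
As a rough illustration of how the flags above might be assembled, here is a minimal Python sketch that builds an argument list from optional settings. The function name `tuning_flags` and its parameters are hypothetical helpers, not part of any tool. The CLI these flags belong to is not named here, so the sketch only produces the flag list and leaves the command itself to the reader. All values shown in the usage note are placeholders, not recommendations.

```python
from typing import List, Optional


def tuning_flags(
    draft_token_count: Optional[int] = None,
    draft_model: Optional[str] = None,
    accelerator_type: Optional[str] = None,
    fp8: bool = False,
) -> List[str]:
    """Build a list of CLI flags from the options named in the list above.

    Hypothetical helper: it only formats flags and does not decide which
    command (model creation or deployment creation) they are passed to.
    """
    flags: List[str] = []
    if draft_token_count is not None:
        # Listed above for model creation; the integer value is caller-supplied.
        flags += ["--draft-token-count", str(draft_token_count)]
    if draft_model is not None:
        # Name of the draft model (speculator); a placeholder string here.
        flags += ["--draft-model", draft_model]
    if accelerator_type is not None:
        # Hardware selection, listed above for deployment creation.
        flags += ["--accelerator-type", accelerator_type]
    if fp8:
        # FP8 is the only precision value mentioned in the list above.
        flags += ["--precision", "FP8"]
    return flags
```

For example, `tuning_flags(draft_token_count=4)` yields the two-element list `["--draft-token-count", "4"]`, which can be appended to whatever model-creation command is in use. The value `4` is an arbitrary placeholder.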