Use Predicted Outputs to boost output generation speeds for editing / rewriting use cases
prediction
field in the Fireworks API with the predicted output. For example, you may want to edit a survey and add an option to contact users by text message:
temperature=0
for best results for most intended use cases of Predicted Outputs. In these cases, using Predicted Outputs does not impact the quality of outputs generatedprediction
field is set by max_tokens
and is 2048 by default, and needs to be updated if you have a longer input and prediction.rewrite_speculation=True
and potentially get even faster output generation. We are working on rolling this out to Serverless soon.