- `model`: refers to one of the Fireworks models.
- `stop`: the returned string includes the stop word for Fireworks, while it is omitted for OpenAI (it can easily be truncated on the client side).
- `max_tokens`: behaves differently if the model's context length is exceeded. If the length of `prompt` or `messages` plus `max_tokens` is higher than the model's context window, `max_tokens` is adjusted lower accordingly. OpenAI returns an invalid request error in this situation. This behavior can be adjusted with the `context_length_exceeded_behavior` parameter.
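The clamping described above can be illustrated with a small sketch. This is not Fireworks' actual implementation; the helper name and token accounting are assumptions made for illustration only:

```python
def effective_max_tokens(prompt_tokens: int, max_tokens: int, context_window: int) -> int:
    """Illustrative sketch of lowering a requested max_tokens so that
    prompt + completion fits inside the model's context window.
    (Hypothetical helper, not part of the Fireworks or OpenAI SDKs.)"""
    return max(0, min(max_tokens, context_window - prompt_tokens))


# A 4096-token window with a 4000-token prompt leaves room for only 96
# completion tokens, so a request for 200 tokens is clamped down.
print(effective_max_tokens(prompt_tokens=4000, max_tokens=200, context_window=4096))  # 96
print(effective_max_tokens(prompt_tokens=100, max_tokens=200, context_window=4096))   # 200
```

With OpenAI, the first request would instead fail with an invalid request error.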
- `usage`: the field is returned in the very last chunk of the response (i.e. the one having `finish_reason` set). The `usage` field won't be listed in the SDK's structure definition, but it can be accessed directly. For example, in Python:

  ```python
  for chunk in openai.ChatCompletion.create(...):
      print(chunk["usage"])
  ```

  or in TypeScript:

  ```typescript
  for await (const chunk of await openai.chat.completions.create(...)) {
    console.log((chunk as any).usage);
  }
  ```
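Since Fireworks includes the stop word in the returned string (see `stop` above), a client that wants OpenAI-style output can trim it off itself. A minimal sketch, assuming a hypothetical helper that is not part of either SDK:

```python
def truncate_at_stop(text: str, stop_words: list[str]) -> str:
    """Cut the completion at the earliest occurring stop word, mimicking
    how OpenAI omits the stop sequence from the returned string.
    (Hypothetical client-side helper, not part of the SDK.)"""
    for stop in stop_words:
        idx = text.find(stop)
        if idx != -1:
            text = text[:idx]
    return text


print(truncate_at_stop("Paris is the capital.\nUser:", ["\nUser:"]))  # Paris is the capital.
```

Successively truncating at each stop word found yields the cut at the earliest occurrence overall, even when multiple stop sequences appear in the text.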
The following parameters are not supported:

- `presence_penalty`
- `frequency_penalty`
- `best_of`: you can use `n` instead
- `logit_bias`
- `functions`: you can use our LangChain integration to achieve similar functionality client-side