Transcribe audio
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Body
The input audio file to transcribe. Common file formats such as mp3, flac, and wav are supported. Before transcription, the audio is resampled to 16 kHz, downmixed to mono, and converted to 16-bit signed little-endian format. Pre-converting the file to this format before sending it to the API can improve runtime performance.
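As a rough illustration of the pre-conversion mentioned above, the sketch below checks whether a WAV file already matches the target format (16 kHz, mono, 16-bit) and builds an ffmpeg command line that would produce such a file. The function names are hypothetical, and the use of ffmpeg is an assumption; any tool that produces the same format should work equally well.

```python
import wave

# Target format described above.
TARGET_RATE = 16_000      # samples per second (16 kHz)
TARGET_CHANNELS = 1       # mono
TARGET_SAMPLE_WIDTH = 2   # bytes per sample (16-bit)


def is_preconverted(path):
    """Return True if the WAV file at `path` is already 16 kHz, mono, 16-bit.

    Hypothetical helper: it only inspects the WAV header and does not
    contact the API.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == TARGET_RATE
            and wav.getnchannels() == TARGET_CHANNELS
            and wav.getsampwidth() == TARGET_SAMPLE_WIDTH
        )


def ffmpeg_convert_command(src, dst):
    """Build an ffmpeg argument list that converts `src` to the target format.

    Assumes ffmpeg is installed; pass the result to subprocess.run(...).
    """
    return [
        "ffmpeg",
        "-i", src,                      # input file (mp3, flac, wav, ...)
        "-ar", str(TARGET_RATE),        # resample to 16 kHz
        "-ac", str(TARGET_CHANNELS),    # downmix to mono
        "-c:a", "pcm_s16le",            # 16-bit signed little-endian PCM
        dst,
    ]
```

For example, ffmpeg_convert_command("talk.mp3", "talk.wav") returns an argument list that can be run with subprocess.run(..., check=True) before uploading talk.wav.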
String name of the ASR model to use. Currently "whisper-v3" is supported.
The target language for transcription. The set of supported target languages can be found here.
The input prompt with which to prime transcription. This can be used, for example, to continue a prior transcription given new audio data.
The format in which to return the response. Can be one of json, text, srt, verbose_json, or vtt.
Sampling temperature to use when decoding text tokens during transcription.
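The sketch below shows one way the authorization header and body parameters described above might be assembled for a request. The form field names (model, language, prompt, response_format, temperature) are assumptions, since this page does not show them, and build_transcription_request is a hypothetical helper rather than part of any SDK.

```python
# Response formats listed on this page.
ALLOWED_FORMATS = ("json", "text", "srt", "verbose_json", "vtt")


def build_transcription_request(token, model="whisper-v3", language=None,
                                prompt=None, response_format=None,
                                temperature=None):
    """Return (headers, fields) for a transcription request.

    Field names are assumed, not confirmed by this page. Optional
    parameters are left out of the form when they are not provided.
    """
    if response_format is not None and response_format not in ALLOWED_FORMATS:
        raise ValueError(
            f"response_format must be one of {ALLOWED_FORMATS}, "
            f"got {response_format!r}"
        )

    # Bearer authentication header of the form "Bearer <token>".
    headers = {"Authorization": f"Bearer {token}"}

    fields = {"model": model}
    if language is not None:
        fields["language"] = language
    if prompt is not None:
        fields["prompt"] = prompt
    if response_format is not None:
        fields["response_format"] = response_format
    if temperature is not None:
        # Form values are sent as strings.
        fields["temperature"] = str(temperature)
    return headers, fields
```

With an HTTP client such as the third-party requests library, the result could be sent as a multipart form along with the audio, for example requests.post(url, headers=headers, data=fields, files={"file": audio_file}), where url is the endpoint address (not shown on this page) and the field name "file" is likewise an assumption.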
Response