# Custom SSO

Source: https://docs.fireworks.ai/accounts/sso

Set up custom Single Sign-On (SSO) authentication for Fireworks AI

Fireworks uses single sign-on (SSO) as the primary mechanism to authenticate with the platform. By default, Fireworks supports Google SSO. If you have an enterprise account, Fireworks supports bringing your own identity provider using:

* OpenID Connect (OIDC) provider
* SAML 2.0 provider

Coordinate with your Fireworks AI representative to enable the integration.

## OpenID Connect (OIDC) provider

Create an OIDC client application in your identity provider, e.g. Okta. Ensure the client is configured for "code authorization" of the "web" type (i.e. with a `client_secret`).

Set the client's "allowed redirect URL" to the URL provided by Fireworks. It looks like:

```
https://fireworks-.auth.us-west-2.amazoncognito.com/oauth2/idpresponse
```

Note down the `issuer`, `client_id`, and `client_secret` for the newly created client. You will need to provide these to your Fireworks AI representative to complete your account setup.

## SAML 2.0 provider

Create a SAML 2.0 application in your identity provider, e.g. [Okta](https://help.okta.com/en-us/Content/Topics/Apps/Apps_App_Integration_Wizard_SAML.htm).

Set the SSO URL to the URL provided by Fireworks. It looks like:

```
https://fireworks-.auth.us-west-2.amazoncognito.com/saml2/idpresponse
```

Configure the Audience URI (SP Entity ID) as provided by Fireworks. It looks like:

```
urn:amazon:cognito:sp:
```

Create an Attribute Statement with the name:

```
http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress
```

and the value `user.email`.

Leave the rest of the settings as defaults.

Note down the "metadata URL" for your newly created application. You will need to provide this to your Fireworks AI representative to complete your account setup.

## Troubleshooting

### Invalid samlResponse or relayState from identity provider

This error occurs if you are trying to use identity provider (IdP) initiated login. Fireworks currently only supports service provider (SP) initiated login. See [Understanding SAML](https://developer.okta.com/docs/concepts/saml/#understand-sp-initiated-sign-in-flow) for an in-depth explanation.

### Required String parameter 'RelayState' is not present

See above.

# Managing users

Source: https://docs.fireworks.ai/accounts/users

Add and delete additional users in your Fireworks account

See the concepts [page](/getting-started/concepts#account) for definitions of accounts and users. Only admin users can manage other users within the account.

## Adding users

To add a new user to your Fireworks account, run the following command. If the email for the new user is already associated with a Fireworks account, they will have the option to freely switch between your account and their existing account(s). You can also add users in the Fireworks web UI at [https://fireworks.ai/account/users](https://fireworks.ai/account/users).
```bash
firectl create user --email="alice@example.com"
```

To create another admin user, pass the `--role=admin` flag:

```bash
firectl create user --email="alice@example.com" --role=admin
```

## Updating a user's role

To update a user's role, run:

```bash
firectl update user --role="{admin,user}"
```

## Deleting users

You can remove a user from your account by running:

```bash
firectl delete user
```

# Batch Delete Batch Jobs

Source: https://docs.fireworks.ai/api-reference-dlde/batch-delete-batch-jobs

post /v1/accounts/{account_id}/batchJobs:batchDelete

# Batch Delete Environments

Source: https://docs.fireworks.ai/api-reference-dlde/batch-delete-environments

post /v1/accounts/{account_id}/environments:batchDelete

# Batch Delete Node Pools

Source: https://docs.fireworks.ai/api-reference-dlde/batch-delete-node-pools

post /v1/accounts/{account_id}/nodePools:batchDelete

# Cancel Batch Job

Source: https://docs.fireworks.ai/api-reference-dlde/cancel-batch-job

post /v1/accounts/{account_id}/batchJobs/{batch_job_id}:cancel

Cancels an existing batch job if it is queued, pending, or running.

# Connect Environment

Source: https://docs.fireworks.ai/api-reference-dlde/connect-environment

post /v1/accounts/{account_id}/environments/{environment_id}:connect

Connects the environment to a node pool. Returns an error if there is an existing pending connection.

# Create Aws Iam Role Binding

Source: https://docs.fireworks.ai/api-reference-dlde/create-aws-iam-role-binding

post /v1/accounts/{account_id}/awsIamRoleBindings

# Create Batch Job

Source: https://docs.fireworks.ai/api-reference-dlde/create-batch-job

post /v1/accounts/{account_id}/batchJobs

# Create Cluster

Source: https://docs.fireworks.ai/api-reference-dlde/create-cluster

post /v1/accounts/{account_id}/clusters

# Create Environment

Source: https://docs.fireworks.ai/api-reference-dlde/create-environment

post /v1/accounts/{account_id}/environments

# Create Node Pool

Source: https://docs.fireworks.ai/api-reference-dlde/create-node-pool

post /v1/accounts/{account_id}/nodePools

# Create Node Pool Binding

Source: https://docs.fireworks.ai/api-reference-dlde/create-node-pool-binding

post /v1/accounts/{account_id}/nodePoolBindings

# Create Snapshot

Source: https://docs.fireworks.ai/api-reference-dlde/create-snapshot

post /v1/accounts/{account_id}/snapshots

# Delete Aws Iam Role Binding

Source: https://docs.fireworks.ai/api-reference-dlde/delete-aws-iam-role-binding

post /v1/accounts/{account_id}/awsIamRoleBindings:delete

# Delete Batch Job

Source: https://docs.fireworks.ai/api-reference-dlde/delete-batch-job

delete /v1/accounts/{account_id}/batchJobs/{batch_job_id}

# Delete Cluster

Source: https://docs.fireworks.ai/api-reference-dlde/delete-cluster

delete /v1/accounts/{account_id}/clusters/{cluster_id}

# Delete Environment

Source: https://docs.fireworks.ai/api-reference-dlde/delete-environment

delete /v1/accounts/{account_id}/environments/{environment_id}

# Delete Node Pool

Source: https://docs.fireworks.ai/api-reference-dlde/delete-node-pool

delete /v1/accounts/{account_id}/nodePools/{node_pool_id}

# Delete Node Pool Binding

Source: https://docs.fireworks.ai/api-reference-dlde/delete-node-pool-binding

post /v1/accounts/{account_id}/nodePoolBindings:delete

# Delete Snapshot

Source: https://docs.fireworks.ai/api-reference-dlde/delete-snapshot

delete /v1/accounts/{account_id}/snapshots/{snapshot_id}

# Disconnect Environment

Source: https://docs.fireworks.ai/api-reference-dlde/disconnect-environment

post /v1/accounts/{account_id}/environments/{environment_id}:disconnect
Disconnects the environment from the node pool. Returns an error if the environment is not connected to a node pool.

# Get Batch Job

Source: https://docs.fireworks.ai/api-reference-dlde/get-batch-job

get /v1/accounts/{account_id}/batchJobs/{batch_job_id}

# Get Batch Job Logs

Source: https://docs.fireworks.ai/api-reference-dlde/get-batch-job-logs

get /v1/accounts/{account_id}/batchJobs/{batch_job_id}:getLogs

# Get Cluster

Source: https://docs.fireworks.ai/api-reference-dlde/get-cluster

get /v1/accounts/{account_id}/clusters/{cluster_id}

# Get Cluster Connection Info

Source: https://docs.fireworks.ai/api-reference-dlde/get-cluster-connection-info

get /v1/accounts/{account_id}/clusters/{cluster_id}:getConnectionInfo

Retrieves connection settings for the cluster, to be put in a kubeconfig.

# Get Environment

Source: https://docs.fireworks.ai/api-reference-dlde/get-environment

get /v1/accounts/{account_id}/environments/{environment_id}

# Get Node Pool

Source: https://docs.fireworks.ai/api-reference-dlde/get-node-pool

get /v1/accounts/{account_id}/nodePools/{node_pool_id}

# Get Node Pool Stats

Source: https://docs.fireworks.ai/api-reference-dlde/get-node-pool-stats

get /v1/accounts/{account_id}/nodePools/{node_pool_id}:getStats

# Get Snapshot

Source: https://docs.fireworks.ai/api-reference-dlde/get-snapshot

get /v1/accounts/{account_id}/snapshots/{snapshot_id}

# List Aws Iam Role Bindings

Source: https://docs.fireworks.ai/api-reference-dlde/list-aws-iam-role-bindings

get /v1/accounts/{account_id}/awsIamRoleBindings

# List Batch Jobs

Source: https://docs.fireworks.ai/api-reference-dlde/list-batch-jobs

get /v1/accounts/{account_id}/batchJobs

# List Clusters

Source: https://docs.fireworks.ai/api-reference-dlde/list-clusters

get /v1/accounts/{account_id}/clusters

# List Environments

Source: https://docs.fireworks.ai/api-reference-dlde/list-environments

get /v1/accounts/{account_id}/environments

# List Node Pool Bindings

Source: https://docs.fireworks.ai/api-reference-dlde/list-node-pool-bindings

get /v1/accounts/{account_id}/nodePoolBindings

# List Node Pools

Source: https://docs.fireworks.ai/api-reference-dlde/list-node-pools

get /v1/accounts/{account_id}/nodePools

# List Snapshots

Source: https://docs.fireworks.ai/api-reference-dlde/list-snapshots

get /v1/accounts/{account_id}/snapshots

# Update Batch Job

Source: https://docs.fireworks.ai/api-reference-dlde/update-batch-job

patch /v1/accounts/{account_id}/batchJobs/{batch_job_id}

# Update Cluster

Source: https://docs.fireworks.ai/api-reference-dlde/update-cluster

patch /v1/accounts/{account_id}/clusters/{cluster_id}

# Update Environment

Source: https://docs.fireworks.ai/api-reference-dlde/update-environment

patch /v1/accounts/{account_id}/environments/{environment_id}

# Update Node Pool

Source: https://docs.fireworks.ai/api-reference-dlde/update-node-pool

patch /v1/accounts/{account_id}/nodePools/{node_pool_id}

# Align transcription

Source: https://docs.fireworks.ai/api-reference/audio-alignments

post /audio/alignments

### Request

##### (multi-part form)

The input audio file to align with text. Common file formats such as mp3, flac, and wav are supported. Note that the audio will be resampled to 16kHz, downmixed to mono, and reformatted to 16-bit signed little-endian format before transcription. Pre-converting the file before sending it to the API can improve runtime performance.

The text to align with the audio.

String name of the voice activity detection (VAD) model to use.
Can be one of `silero` or `whisperx-pyannet`.

String name of the alignment model to use. Currently supported:

* `mms_fa` optimal accuracy for multilingual speech.
* `tdnn_ffn` optimal accuracy for English-only speech.
* `gentle` best accuracy for English-only speech (requires a dedicated endpoint, contact us at [inquiries@fireworks.ai](mailto:inquiries@fireworks.ai)).

The format in which to return the response. Can be one of `srt`, `verbose_json`, or `vtt`.

Audio preprocessing mode. Currently supported:

* `none` to skip audio preprocessing.
* `dynamic` for arbitrary audio content with variable loudness.
* `soft_dynamic` for speech-intense recordings such as podcasts and voice-overs.
* `bass_dynamic` for boosting lower frequencies.

### Response

The task which was performed. Either `transcribe` or `translate`.

The language of the transcribed/translated text.

The duration of the transcribed/translated audio, in seconds.

The transcribed/translated text.

Extracted words and their corresponding timestamps.

The text content of the word.

Start time of the word in seconds.

End time of the word in seconds.

Segments of the transcribed/translated text and their corresponding details.

```curl curl
# Download audio file
curl -sL -o "audio.flac" "https://tinyurl.com/3pddjjdc"

# Make request
curl -X POST "https://api.fireworks.ai/inference/v1/audio/alignments" \
  -H "Authorization: Bearer <API_KEY>" \
  -F "file=@audio.flac" \
  -F "text=At this turning point of history there manifest themselves, side by side and often mixed and entangled together, a magnificent, manifold, virgin forest-like upgrowth and upstriving, a kind of tropical tempo in the rivalry of growth, and an extraordinary decay and self-destruction owing to the savagely opposing and seemingly exploding egoisms which strive with one another for sun and light, and can no longer assign any limit, restraint, or forbearance for themselves by means of the hitherto existing morality"
```

```python python
!pip install fireworks-ai requests

import time
import requests
from fireworks.client.audio import AudioInference

# Prepare client
audio = requests.get("https://tinyurl.com/3pddjjdc").content
text = "At this turning point of history there manifest themselves, side by side and often mixed and entangled together, a magnificent, manifold, virgin forest-like upgrowth and upstriving, a kind of tropical tempo in the rivalry of growth, and an extraordinary decay and self-destruction owing to the savagely opposing and seemingly exploding egoisms which strive with one another for sun and light, and can no longer assign any limit, restraint, or forbearance for themselves by means of the hitherto existing morality"
client = AudioInference(
    model="whisper-v3-turbo",
    base_url="https://audio-prod.us-virginia-1.direct.fireworks.ai",
    api_key="<API_KEY>",
)

# Make request
start = time.time()
r = await client.align_async(audio=audio, text=text)
print(f"Took: {(time.time() - start):.3f}s. Response: '{r}'")
```

# Streaming Transcription

Source: https://docs.fireworks.ai/api-reference/audio-streaming-transcriptions

websocket /audio/transcriptions/streaming

Streaming transcription is performed over a WebSocket. Provide the transcription parameters and establish a WebSocket connection to the endpoint. Stream short audio chunks (50-400ms) in binary frames of PCM 16-bit little-endian at a 16kHz sample rate and a single channel (mono). In parallel, receive transcription from the WebSocket.

Stream audio to get transcription continuously in real-time.
### URL

Please use the following serverless endpoint:

```
wss://audio-streaming.us-virginia-1.direct.fireworks.ai/v1/audio/transcriptions/streaming
```

### Headers

Your Fireworks API key, e.g. `Authorization=API_KEY`.

### Query Parameters

The format in which to return the response. Currently only `verbose_json` is recommended for streaming.

The target language for transcription. The set of supported target languages can be found [here](https://github.com/openai/whisper/blob/ba3f3cd54b0e5b8ce1ab3de13e32122d0d5f98ab/whisper/tokenizer.py#L10-L128).

The input prompt that the model will use when generating the transcription. Can be used to specify custom words or specify the style of the transcription. E.g. `Um, here's, uh, what was recorded.` will make the model include the filler words in the transcription.

Sampling temperature to use when decoding text tokens during transcription.

### Streaming Audio

Stream short audio chunks (50-400ms) in binary frames of PCM 16-bit little-endian at a 16kHz sample rate and a single channel (mono). Typically, you will:

1. Resample your audio to 16 kHz if it is not already.
2. Convert it to mono.
3. Send 50ms chunks (16,000 Hz × 0.05 s = 800 samples) of audio in 16-bit PCM (signed, little-endian) format.

### Handling Responses

The client maintains a state dictionary, starting with an empty dictionary `{}`. When the server sends the first transcription message, it contains a list of segments. Each segment has an `id` and `text`:

```python
# Server initial message:
{
    "segments": [
        {"id": "0", "text": "This is the first sentence"},
        {"id": "1", "text": "This is the second sentence"}
    ]
}

# Client initial state:
{
    "0": "This is the first sentence",
    "1": "This is the second sentence",
}
```

When the server sends the next updates to the transcription, the client updates the state dictionary based on the segment `id`:

```python
# Server continuous message:
{
    "segments": [
        {"id": "1", "text": "This is the second sentence modified"},
        {"id": "2", "text": "This is the third sentence"}
    ]
}

# Client updated state:
{
    "0": "This is the first sentence",
    "1": "This is the second sentence modified",  # overwritten
    "2": "This is the third sentence",  # new
}
```

### Example Usage

Check out a brief Python example below or the example sources:

* [Python notebook](https://colab.research.google.com/github/fw-ai/cookbook/blob/main/learn/audio/audio_streaming_speech_to_text/audio_streaming_speech_to_text.ipynb)
* [Python sources](https://github.com/fw-ai/cookbook/tree/main/learn/audio/audio_streaming_speech_to_text/python)
* [Node.js sources](https://github.com/fw-ai/cookbook/tree/main/learn/audio/audio_streaming_speech_to_text/nodejs)

```python
!pip3 install requests torch torchaudio websocket-client

import io
import time
import json
import torch
import requests
import torchaudio
import threading
import websocket
import urllib.parse

lock = threading.Lock()
state = {}

def on_open(ws):
    def send_audio_chunks():
        for chunk in audio_chunk_bytes:
            ws.send(chunk, opcode=websocket.ABNF.OPCODE_BINARY)
            time.sleep(chunk_size_ms / 1000)
        final_checkpoint = json.dumps({"checkpoint_id": "final"})
        ws.send(final_checkpoint, opcode=websocket.ABNF.OPCODE_TEXT)

    threading.Thread(target=send_audio_chunks).start()

def on_message(ws, message):
    message = json.loads(message)
    if message.get("checkpoint_id") == "final":
        ws.close()
        return
    update = {s["id"]: s["text"] for s in message["segments"]}
    with lock:
        state.update(update)
        print("\n".join(f" - {k}: {v}" for k, v in state.items()))
def on_error(ws, error):
    print(f"WebSocket error: {error}")

# Download an audio file and convert it to the required format:
# 16 kHz, mono, 16-bit signed little-endian PCM, split into 50ms chunks.
chunk_size_ms = 50
audio_data = requests.get("https://tinyurl.com/4cb74vas").content
waveform, sample_rate = torchaudio.load(io.BytesIO(audio_data), format="flac")
waveform = torch.mean(waveform, dim=0, keepdim=True)  # downmix to mono
waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)
pcm_bytes = (waveform * 32768.0).clamp(-32768, 32767).to(torch.int16).numpy().tobytes()
chunk_size_bytes = 2 * 16_000 * chunk_size_ms // 1000  # 2 bytes per 16-bit sample
audio_chunk_bytes = [
    pcm_bytes[i : i + chunk_size_bytes]
    for i in range(0, len(pcm_bytes), chunk_size_bytes)
]

# Open a connection URL with query params
url = "wss://audio-streaming.us-virginia-1.direct.fireworks.ai/v1/audio/transcriptions/streaming"
params = urllib.parse.urlencode({
    "language": "en",
})
ws = websocket.WebSocketApp(
    f"{url}?{params}",
    header={"Authorization": "<API_KEY>"},
    on_open=on_open,
    on_message=on_message,
    on_error=on_error,
)
ws.run_forever()
```

### Dedicated endpoint

For fixed throughput and predictable SLAs, you may request a dedicated endpoint for streaming transcription at [inquiries@fireworks.ai](mailto:inquiries@fireworks.ai) or on [Discord](https://discord.gg/fireworks-ai).

# Transcribe audio

Source: https://docs.fireworks.ai/api-reference/audio-transcriptions

post /audio/transcriptions

Send a sample audio to get a transcription.

### Request

##### (multi-part form)

The input audio file to transcribe, or a URL to a public audio file. Max audio file size is 1 GB; there is no limit on audio duration. Common file formats such as mp3, flac, and wav are supported. Note that the audio will be resampled to 16kHz, downmixed to mono, and reformatted to 16-bit signed little-endian format before transcription. Pre-converting the file before sending it to the API can improve runtime performance.

String name of the ASR model to use. Can be one of `whisper-v3` or `whisper-v3-turbo`. Please use the following serverless endpoints:

* [https://audio-prod.us-virginia-1.direct.fireworks.ai](https://audio-prod.us-virginia-1.direct.fireworks.ai) (for `whisper-v3`)
* [https://audio-turbo.us-virginia-1.direct.fireworks.ai](https://audio-turbo.us-virginia-1.direct.fireworks.ai) (for `whisper-v3-turbo`)

String name of the voice activity detection (VAD) model to use. Can be one of `silero` or `whisperx-pyannet`.

String name of the alignment model to use. Currently supported:

* `mms_fa` optimal accuracy for multilingual speech.
* `tdnn_ffn` optimal accuracy for English-only speech.
* `gentle` best accuracy for English-only speech (requires a dedicated endpoint, contact us at [inquiries@fireworks.ai](mailto:inquiries@fireworks.ai)).

The target language for transcription. The set of supported target languages can be found [here](https://github.com/openai/whisper/blob/ba3f3cd54b0e5b8ce1ab3de13e32122d0d5f98ab/whisper/tokenizer.py#L10-L128).

The input prompt that the model will use when generating the transcription. Can be used to specify custom words or specify the style of the transcription. E.g. `Um, here's, uh, what was recorded.` will make the model include the filler words in the transcription.

Sampling temperature to use when decoding text tokens during transcription.

The format in which to return the response. Can be one of `json`, `text`, `srt`, `verbose_json`, or `vtt`.

The timestamp granularities to populate for this transcription. `response_format` must be set to `verbose_json` to use timestamp granularities. Either or both of these options are supported. Can be one of `word` or `segment`. If not present, defaults to `segment`.

Audio preprocessing mode. Currently supported:

* `none` to skip audio preprocessing.
* `dynamic` for arbitrary audio content with variable loudness.
* `soft_dynamic` for speech-intense recordings such as podcasts and voice-overs.
* `bass_dynamic` for boosting lower frequencies.

### Response

The task which was performed. Either `transcribe` or `translate`.

The language of the transcribed/translated text.
The duration of the transcribed/translated audio, in seconds.

The transcribed/translated text.

Extracted words and their corresponding timestamps.

The text content of the word.

Start time of the word in seconds.

End time of the word in seconds.

Segments of the transcribed/translated text and their corresponding details.

```curl curl
# Download audio file
curl -L -o "audio.flac" "https://tinyurl.com/4997djsh"

# Make request
curl -X POST "https://audio-prod.us-virginia-1.direct.fireworks.ai/v1/audio/transcriptions" \
  -H "Authorization: <API_KEY>" \
  -F "file=@audio.flac"
```

```python python
!pip install fireworks-ai requests

import time
import requests
from fireworks.client.audio import AudioInference

# Prepare client
audio = requests.get("https://tinyurl.com/4cb74vas").content
client = AudioInference(
    model="whisper-v3",
    base_url="https://audio-prod.us-virginia-1.direct.fireworks.ai",
    # # Or for the turbo version
    # model="whisper-v3-turbo",
    # base_url="https://audio-turbo.us-virginia-1.direct.fireworks.ai",
    api_key="<API_KEY>",
)

# Make request
start = time.time()
r = await client.transcribe_async(audio=audio)
print(f"Took: {(time.time() - start):.3f}s. Text: '{r.text}'")
```

# Translate audio

Source: https://docs.fireworks.ai/api-reference/audio-translations

post /audio/translations

### Request

##### (multi-part form)

The input audio file to translate, or a URL to a public audio file. Max audio file size is 1 GB; there is no limit on audio duration. Common file formats such as mp3, flac, and wav are supported. Note that the audio will be resampled to 16kHz, downmixed to mono, and reformatted to 16-bit signed little-endian format before transcription. Pre-converting the file before sending it to the API can improve runtime performance.

String name of the ASR model to use. Can be one of `whisper-v3` or `whisper-v3-turbo`. Please use the following serverless endpoints:

* [https://audio-prod.us-virginia-1.direct.fireworks.ai](https://audio-prod.us-virginia-1.direct.fireworks.ai) (for `whisper-v3`)
* [https://audio-turbo.us-virginia-1.direct.fireworks.ai](https://audio-turbo.us-virginia-1.direct.fireworks.ai) (for `whisper-v3-turbo`)

String name of the voice activity detection (VAD) model to use. Can be one of `silero` or `whisperx-pyannet`.

String name of the alignment model to use. Currently supported:

* `mms_fa` optimal accuracy for multilingual speech.
* `tdnn_ffn` optimal accuracy for English-only speech.
* `gentle` best accuracy for English-only speech (requires a dedicated endpoint, contact us at [inquiries@fireworks.ai](mailto:inquiries@fireworks.ai)).

The target language for transcription. The set of supported target languages can be found [here](https://github.com/openai/whisper/blob/ba3f3cd54b0e5b8ce1ab3de13e32122d0d5f98ab/whisper/tokenizer.py#L10-L128).

The input prompt that the model will use when generating the transcription. Can be used to specify custom words or specify the style of the transcription. E.g. `Um, here's, uh, what was recorded.` will make the model include the filler words in the transcription.

Sampling temperature to use when decoding text tokens during transcription.

The format in which to return the response. Can be one of `json`, `text`, `srt`, `verbose_json`, or `vtt`.

The timestamp granularities to populate for this transcription. `response_format` must be set to `verbose_json` to use timestamp granularities. Either or both of these options are supported. Can be one of `word` or `segment`. If not present, defaults to `segment`.

Audio preprocessing mode.
Currently supported:

* `none` to skip audio preprocessing.
* `dynamic` for arbitrary audio content with variable loudness.
* `soft_dynamic` for speech-intense recordings such as podcasts and voice-overs.
* `bass_dynamic` for boosting lower frequencies.

### Response

The task which was performed. Either `transcribe` or `translate`.

The language of the transcribed/translated text.

The duration of the transcribed/translated audio, in seconds.

The transcribed/translated text.

Extracted words and their corresponding timestamps.

The text content of the word.

Start time of the word in seconds.

End time of the word in seconds.

Segments of the transcribed/translated text and their corresponding details.

```curl curl
# Download audio file
curl -L -o "audio.flac" "https://tinyurl.com/4997djsh"

# Make request
curl -X POST "https://audio-prod.us-virginia-1.direct.fireworks.ai/v1/audio/translations" \
  -H "Authorization: <API_KEY>" \
  -F "file=@audio.flac"
```

```python python
!pip install fireworks-ai requests

import time
import requests
from fireworks.client.audio import AudioInference

# Prepare client
audio = requests.get("https://tinyurl.com/4cb74vas").content
client = AudioInference(
    model="whisper-v3",
    base_url="https://audio-prod.us-virginia-1.direct.fireworks.ai",
    # # Or for the turbo version
    # model="whisper-v3-turbo",
    # base_url="https://audio-turbo.us-virginia-1.direct.fireworks.ai",
    api_key="<API_KEY>",
)

# Make request
start = time.time()
r = await client.translate_async(audio=audio)
print(f"Took: {(time.time() - start):.3f}s. Text: '{r.text}'")
```

# Create API Key

Source: https://docs.fireworks.ai/api-reference/create-api-key

post /v1/accounts/{account_id}/apiKeys

# Create Dataset

Source: https://docs.fireworks.ai/api-reference/create-dataset

post /v1/accounts/{account_id}/datasets

# CRUD APIs for deployed models.

Source: https://docs.fireworks.ai/api-reference/create-deployed-model

post /v1/accounts/{account_id}/deployedModels

# Create Deployment

Source: https://docs.fireworks.ai/api-reference/create-deployment

post /v1/accounts/{account_id}/deployments

# Create Model

Source: https://docs.fireworks.ai/api-reference/create-model

post /v1/accounts/{account_id}/models

# Create User

Source: https://docs.fireworks.ai/api-reference/create-user

post /v1/accounts/{account_id}/users

# Create embeddings

Source: https://docs.fireworks.ai/api-reference/creates-an-embedding-vector-representing-the-input-text

post /embeddings

# Delete API Key

Source: https://docs.fireworks.ai/api-reference/delete-api-key

post /v1/accounts/{account_id}/apiKeys:delete

# Delete Dataset

Source: https://docs.fireworks.ai/api-reference/delete-dataset

delete /v1/accounts/{account_id}/datasets/{dataset_id}

# Delete Deployed Model

Source: https://docs.fireworks.ai/api-reference/delete-deployed-model

delete /v1/accounts/{account_id}/deployedModels/{deployed_model_id}

# Delete Deployment

Source: https://docs.fireworks.ai/api-reference/delete-deployment

delete /v1/accounts/{account_id}/deployments/{deployment_id}

# Delete Model

Source: https://docs.fireworks.ai/api-reference/delete-model

delete /v1/accounts/{account_id}/models/{model_id}

# Generate an image

Source: https://docs.fireworks.ai/api-reference/generate-a-new-image-from-a-text-prompt

The official API reference for image generation workloads can be found on the corresponding model pages by clicking "view code".
We support generating images from text prompts, other images, and/or ControlNet:

* [https://fireworks.ai/models/fireworks/stable-diffusion-xl-1024-v1-0](https://fireworks.ai/models/fireworks/stable-diffusion-xl-1024-v1-0)
* [https://fireworks.ai/models/fireworks/SSD-1B](https://fireworks.ai/models/fireworks/SSD-1B)
* [https://fireworks.ai/models/fireworks/playground-v2-1024px-aesthetic](https://fireworks.ai/models/fireworks/playground-v2-1024px-aesthetic)
* [https://fireworks.ai/models/fireworks/japanese-stable-diffusion-xl](https://fireworks.ai/models/fireworks/japanese-stable-diffusion-xl)

# Get Account

Source: https://docs.fireworks.ai/api-reference/get-account

get /v1/accounts/{account_id}

# Get Dataset

Source: https://docs.fireworks.ai/api-reference/get-dataset

get /v1/accounts/{account_id}/datasets/{dataset_id}

# Get Dataset Upload Endpoint

Source: https://docs.fireworks.ai/api-reference/get-dataset-upload-endpoint

post /v1/accounts/{account_id}/datasets/{dataset_id}:getUploadEndpoint

# Get Deployment

Source: https://docs.fireworks.ai/api-reference/get-deployment

get /v1/accounts/{account_id}/deployments/{deployment_id}

# Get Model

Source: https://docs.fireworks.ai/api-reference/get-model

get /v1/accounts/{account_id}/models/{model_id}

# Get Model Download Endpoint

Source: https://docs.fireworks.ai/api-reference/get-model-download-endpoint

get /v1/accounts/{account_id}/models/{model_id}:getDownloadEndpoint

# Get Model Upload Endpoint

Source: https://docs.fireworks.ai/api-reference/get-model-upload-endpoint

post /v1/accounts/{account_id}/models/{model_id}:getUploadEndpoint

# Get User

Source: https://docs.fireworks.ai/api-reference/get-user

get /v1/accounts/{account_id}/users/{user_id}

# Introduction

Source: https://docs.fireworks.ai/api-reference/introduction

The Fireworks AI REST API enables you to interact with various language, image, and embedding models using an API key.

## Authentication

All requests made to the Fireworks AI REST API must include an `Authorization` header. The header should specify a valid `Bearer` token with your API key, and requests must be encoded as JSON with the `Content-Type: application/json` header. This ensures that your requests are properly authenticated and formatted for interaction with Fireworks AI. A sample header to be included in the REST API request looks like this:

```json
authorization: Bearer <API_KEY>
```

# List API Keys

Source: https://docs.fireworks.ai/api-reference/list-api-keys

get /v1/accounts/{account_id}/apiKeys

# List Datasets

Source: https://docs.fireworks.ai/api-reference/list-datasets

get /v1/accounts/{account_id}/datasets

# List Deployments

Source: https://docs.fireworks.ai/api-reference/list-deployments

get /v1/accounts/{account_id}/deployments

# List Models

Source: https://docs.fireworks.ai/api-reference/list-models

get /v1/accounts/{account_id}/models

# List Users

Source: https://docs.fireworks.ai/api-reference/list-users

get /v1/accounts/{account_id}/users

# Create Chat Completion

Source: https://docs.fireworks.ai/api-reference/post-chatcompletions

post /chat/completions

Creates a model response for the given chat conversation.

# Create Completion

Source: https://docs.fireworks.ai/api-reference/post-completions

post /completions

Creates a completion for the provided prompt and parameters.
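As an illustration of the authentication scheme described above, here is a minimal sketch of a chat completion request using Python's `requests` library. The model path is an example; substitute any model available to your account and replace `<API_KEY>` with your own key.

```python
import requests

url = "https://api.fireworks.ai/inference/v1/chat/completions"
headers = {
    "Authorization": "Bearer <API_KEY>",  # your Fireworks API key
    "Content-Type": "application/json",
}
payload = {
    # Example model path; any chat model available to your account works.
    "model": "accounts/fireworks/models/llama-v3p1-8b-instruct",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "max_tokens": 64,
}

response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```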
# Update Dataset

Source: https://docs.fireworks.ai/api-reference/update-dataset

patch /v1/accounts/{account_id}/datasets/{dataset_id}

# Update Deployment

Source: https://docs.fireworks.ai/api-reference/update-deployment

patch /v1/accounts/{account_id}/deployments/{deployment_id}

# Update Model

Source: https://docs.fireworks.ai/api-reference/update-model

patch /v1/accounts/{account_id}/models/{model_id}

# Update User

Source: https://docs.fireworks.ai/api-reference/update-user

patch /v1/accounts/{account_id}/users/{user_id}

# Upload Dataset Files

Source: https://docs.fireworks.ai/api-reference/upload-dataset-files

post /v1/accounts/{account_id}/datasets/{dataset_id}:upload

Provides a streamlined way to upload a dataset file in a single API request. This path can handle file sizes up to 150 MB. For larger files, use [Get Dataset Upload Endpoint](get-dataset-upload-endpoint).

# Validate Dataset Upload

Source: https://docs.fireworks.ai/api-reference/validate-dataset-upload

post /v1/accounts/{account_id}/datasets/{dataset_id}:validateUpload

# Validate Model Upload

Source: https://docs.fireworks.ai/api-reference/validate-model-upload

get /v1/accounts/{account_id}/models/{model_id}:validateUpload

# Start here

Source: https://docs.fireworks.ai/cookbook/cookbook_landing

The **Fireworks Cookbook** is your hands-on guide to building, deploying, and fine-tuning generative AI and agentic workflows. It offers curated examples, Jupyter notebooks, apps, and resources tailored to various use cases and skill levels, making it a go-to resource for practical Fireworks implementations.

In this cookbook, you’ll find:

* **Production-ready projects**: Scalable, proven solutions with ongoing support from the Fireworks engineering team.
* **Learning-focused tutorials**: Step-by-step guides for hands-on exploration, ideal for interactive learning of AI techniques.
* **Community-driven showcases**: Creative user-contributed projects that showcase innovative applications of Fireworks in diverse contexts.

***

## Repository structure

To help you easily navigate and find the right resources, the Cookbook organizes examples by purpose:
**Hands-on projects for learning AI techniques**, maintained by the DevRel team.

**Explore user-contributed projects** that push creative boundaries with Fireworks.
***

### Feedback & support

We value your feedback! If you encounter issues, need clarification, or have questions, please contact us via:

* **Discord Community**: [discord.gg/fireworks-ai](https://discord.gg/fireworks-ai)
* **Email Support**: [inquiries@fireworks.ai](mailto:inquiries@fireworks.ai)

***

**Additional resources:**

* [Fireworks AI Blog](https://fireworks.ai/blog)
* [Fireworks AI YouTube](https://www.youtube.com/channel/UCHCffBTGYa1Ut72h03ldtGA)
* [Fireworks AI Twitter](https://x.com/fireworksai_hq)

# Build with Fireworks

Source: https://docs.fireworks.ai/cookbook/learn_with_fireworks/ecosystem_examples

Step-by-step guides for hands-on exploration, ideal for interactive learning of AI techniques.

## Inference

Explore notebooks and projects showcasing how to run generative AI models on Fireworks, demonstrating both third-party integrations and innovative applications with industry-leading speed and flexibility.

### LLMs

Dive into examples that utilize Fireworks for deploying and fine-tuning large language models (LLMs), featuring integrations with popular libraries and cutting-edge use cases.

**Notebooks**

* (Python) An interactive Streamlit app for comparing LLMs on Fireworks with parameter tuning and LLM-as-a-Judge functionality.
* (Python) Demonstrates structured responses using Llama 3.1, covering Grammar Mode and JSON Mode for consistent output formats.
* (Python) Explores generating synthetic data with Llama 3.1 models on Fireworks, including structured outputs for quizzes.
* (Python) Uses DeepSeek V3 & R1 to generate structured PC specifications while explaining component choices using Reasoning JSON Mode.
* (Python) Demonstrates structured patient record generation using Reasoning JSON Mode to explain treatment recommendations.

**Apps**

* A Next.js app for real-time transcription chat using Fireworks and Vercel integration.

### Visual-language

Discover projects combining vision and language capabilities using Fireworks, integrating external frameworks for seamless multimodal understanding.

### Audio

Explore real-time audio transcription, processing, and generation examples using Fireworks’ advanced audio models and integrations.

**Notebooks**

* A notebook demonstrating real-time audio transcription using Fireworks' `whisper-v3-large` compatible model. The project includes streaming audio input and getting transcription messages, making it ideal for tasks requiring accurate and responsive audio processing.
* Stream audio to get transcription continuously in real-time.

### Image

Experiment with image-based projects using Fireworks’ models, enhanced with third-party libraries for innovative applications in image creation, manipulation, and recognition.

### Multimodal

Learn from complex multimodal examples that blend text, audio, and image inputs, demonstrating the full potential of Fireworks combined with external tools for interactive AI experiences.

***

## Fine-tuning

Access notebooks that demonstrate efficient model fine-tuning on Fireworks, utilizing both internal capabilities and third-party tools like Axolotl for custom optimization.

### Multi-LoRA

Explore notebooks showcasing the integration and utilization of multiple LoRA adapters in Fireworks. These resources demonstrate advanced techniques for merging, fine-tuning, and deploying multi-LoRA configurations to optimize model performance across diverse tasks.
**Notebooks**

* (Python) An interactive guide showcasing the integration of Multi-LoRA adapters on Fireworks, enabling fine-tuned responses for diverse product domains such as beauty, fashion, outdoor gear, and baby products.

***

## Function calling

Explore examples of function-calling workflows using Fireworks, showcasing how to integrate with external APIs and tools for sophisticated, multi-step AI operations.

**Notebooks**

* Demonstrates Function-Calling with LangChain integration, including custom tool routing and query handling. (Python)
* Explores the integration of Fireworks' function-calling model with LangChain tools. This notebook demonstrates building basic agents using `firefunction-v1` for tasks like answering questions, retrieving stock prices, and generating images with the Fireworks SDXL API. (JavaScript)
* Showcases Function-Calling with LangGraph integration for graph-based agent systems and tool queries. (Python)
* Uses Fireworks' Function-Calling for structured QA with OpenAI, featuring multi-turn conversation handling. (Python)
* Demonstrates querying financial data using Fireworks' Function-Calling API with integrated tool setup. (Python)
* Extracts structured information from web content using Fireworks' Function-Calling API. (Python)
* Generates stock charts using Fireworks' Function-Calling API with AutoGen integration. (Python)

**Apps**

* A demo app showcasing chat with function-calling capabilities for dynamic service invocation.

***

## RAG

Build retrieval-augmented generation (RAG) systems with Fireworks, featuring projects that connect with vector databases and search tools for enhanced, context-aware AI responses.

**Notebooks**

* A basic RAG implementation using ChromaDB with League of Legends data, comparing responses across multiple models. (Python)
* An agentic system using RAG for generating catchy research paper titles with embeddings and LLM completions. (Python)
* A movie recommendation system using Fireworks' function-calling models and MongoDB Atlas for personalized, real-time suggestions. (Python)

**Apps**

* A RAG chatbot using SurrealDB for vector storage and Fireworks for real-time, context-aware responses.

***

### Integration partners

We welcome contributions from integration partners! Follow these steps:

1. **Clone the Repo**: [Fireworks Cookbook repo](https://github.com/fw-ai/cookbook)
2. **Create Folder**: Add your company/tool under `integrations`
3. **Add Examples**: Include code, notebooks, or demos
4. **Use Template**: Fill out the [integration guide](https://github.com/fw-ai/cookbook/blob/main/integrations/template_integration_guide.md)
5. **Submit PR**: Create a pull request
6. **Review**: Fireworks will review and merge

Need help? Contact us or open an issue.

***

### Support

For help or feedback:

* **Discord**: [Join us](https://discord.gg/fireworks-ai)
* **Email**: [Contact us](mailto:inquiries@fireworks.ai)

**Resources**:

* [Blog](https://fireworks.ai/blog)
* [YouTube](https://www.youtube.com/channel/UCHCffBTGYa1Ut72h03ldtGA)
* [Twitter](https://x.com/fireworksai_hq)

# Community showcase

Source: https://docs.fireworks.ai/cookbook/projects_showcase/community_examples

Creative user-contributed projects that showcase innovative applications of Fireworks in diverse contexts.

* Convert any PDF into a personalized podcast using open-source LLMs and TTS models. Powered by Fireworks-hosted Llama 3.1, MeloTTS, and Bark, this app generates engaging dialogue and outputs it as an MP3 file via a user-friendly Gradio interface.
* High-throughput code generation with Qwen2.5 Coder models, optimized for fast inference on Fireworks. Includes a robust pipeline for data creation, fine-tuning with Unsloth, and real-time application in AI-powered code editors.
* Ensure accurate and reliable technical documentation with ProoferX, built using Fireworks’ fast Llama models and Firefunc for structured output. This project addresses a key challenge in developer tools by validating and streamlining documentation with real-time checks.

***

## Community project submissions

We welcome your contributions to the **Fireworks Cookbook**! Share your projects and help expand our collaborative resource. Here’s how:

1. **Clone the Repo**: [Fireworks Cookbook](https://github.com/fw-ai/cookbook) and go to `showcase`.
2. **Create Folder**: Add a folder named after your project.
3. **Include Code**: Add notebooks, apps, or other resources demonstrating your project.
4. **Complete Template**: Fill out the [Showcase Template](https://github.com/fw-ai/cookbook/blob/main/showcase/template_projectMDX.md) for key project details.
5. **Submit PR**: Submit your project as a pull request.
6. **Review & Feature**: Our team will review your submission; selected projects may be highlighted in docs or social media.

***

### Support

For help or feedback:

* **Discord**: [Join us](https://discord.gg/fireworks-ai)
* **Email**: [Contact us](mailto:inquiries@fireworks.ai)

**Resources**:

* [Blog](https://fireworks.ai/blog)
* [YouTube](https://www.youtube.com/channel/UCHCffBTGYa1Ut72h03ldtGA)
* [Twitter](https://x.com/fireworksai_hq)

# DeepSeek Resources

Source: https://docs.fireworks.ai/deepseek/general-deepseek

Access information, blog posts, FAQs, and detailed documentation for DeepSeek v3 and R1.

## 1. How to Access DeepSeek v3 & R1

DeepSeek models are available on Fireworks AI with flexible deployment options. You can test DeepSeek v3 and R1 in an interactive environment without any coding.

🔗 [Try DeepSeek v3 on Fireworks Playground](https://fireworks.ai/playground?model=deepseek-v3)\
🔗 [Try DeepSeek R1 on Fireworks Playground](https://fireworks.ai/playground?model=deepseek-r1)

Developers can integrate DeepSeek models into applications using Fireworks' API.

🔗 [Fireworks API Reference](https://docs.fireworks.ai/api-reference/introduction)\
🔗 [Using reasoning JSON mode](https://docs.fireworks.ai/structured-responses/structured-response-formatting#reasoning-model-json-mode)

* **Serverless API** – Instantly access DeepSeek models with pay-as-you-go pricing.
* **Dedicated Deployments** – Higher throughput and lower latency for enterprise use. Contact us at [inquiries@fireworks.ai](mailto:inquiries@fireworks.ai).

***

## 2. General FAQs

Below are common questions about DeepSeek models on Fireworks, organized by category.

#### Model Integrity & Modifications

No, Fireworks hosts the unaltered versions of DeepSeek models.

* ❌ No quantization – Full-precision versions are hosted.
* ❌ No additional censorship – Fireworks does not apply additional content moderation beyond DeepSeek’s built-in policies.
* ❌ No forced system prompt – Users have full control over prompts.

🔹 Fireworks hosts DeepSeek R1 and V3 models on Serverless. Contact us at [inquiries@fireworks.ai](mailto:inquiries@fireworks.ai) if you need a dedicated deployment.

🔹 Fireworks also offers the six R1-distill models released by DeepSeek via on-demand deployments.

#### Data Privacy & Hosting Locations

Fireworks has zero data retention by default and does not log or store prompt or generation data. See [Fireworks Data Handling Policy](https://docs.fireworks.ai/guides/security_compliance/data_handling) for details.

Fireworks hosts DeepSeek models on servers in North America and the EU. The company DeepSeek does not have access to user API requests or outputs.

#### Pricing & Cost Considerations

Fireworks hosts DeepSeek models on our own infrastructure. We do not proxy requests to the DeepSeek API. We are continuously optimizing the models for speed and throughput, and we offer useful developer features like JSON mode, structured outputs, and dedicated deployment options.

Yes, Fireworks offers dedicated deployments for DeepSeek models. Contact us at [inquiries@fireworks.ai](mailto:inquiries@fireworks.ai) if you need a dedicated deployment.

* 🚀 Lower latency – Dedicated instances have better response times than shared serverless.
* 📈 Higher throughput – More consistent performance for large-scale applications.
* 💰 Pricing: Depends on workload; contact us at [inquiries@fireworks.ai](mailto:inquiries@fireworks.ai).

#### Output Control & Limits

Yes! Fireworks supports structured outputs through:

* ✔️ **JSON Mode** – Enforce JSON responses for structured applications (see the sketch below).
* ✔️ **Grammar Mode** – Define syntactic constraints for predictable outputs.
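For example, here is a minimal sketch of enabling JSON Mode through the OpenAI-compatible chat completions endpoint. The prompt is illustrative; replace `<API_KEY>` with your own key.

```python
import requests

payload = {
    "model": "accounts/fireworks/models/deepseek-v3",
    "messages": [
        {"role": "user", "content": "List three PC components as a JSON object."}
    ],
    # Enforce a JSON response; a JSON schema can also be supplied here.
    "response_format": {"type": "json_object"},
}
response = requests.post(
    "https://api.fireworks.ai/inference/v1/chat/completions",
    headers={"Authorization": "Bearer <API_KEY>"},  # your Fireworks API key
    json=payload,
)
print(response.json()["choices"][0]["message"]["content"])
```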
Currently, DeepSeek R1 does not support native function calling like OpenAI models. However:

* Users can implement function calling logic via prompt engineering or structured output parsing.
* Fireworks is evaluating future support for function calling in DeepSeek models.

The max token length for DeepSeek models is limited only by the context window of the model, which is **128K tokens**. If responses are cut off, try increasing `max_tokens` in your API call:\
🔗 [Fireworks Max Tokens Documentation](https://docs.fireworks.ai/guides/querying-text-models#max-tokens)

#### Parsing & API Response Handling

DeepSeek R1 uses `<think>` tags to denote reasoning before the final structured output. Fireworks defaults to the simplest approach of returning `<think>...</think>` in the response and letting the user parse the response, such as using regex parsing.
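For example, here is a minimal sketch of the regex approach; the `content` value is illustrative and would normally come from the API response:

```python
import re

# Raw model output, e.g. response.json()["choices"][0]["message"]["content"]
content = "<think>The user wants a greeting, so reply politely.</think>Hello! How can I help?"

# Extract the reasoning between <think> tags, if present.
match = re.search(r"<think>(.*?)</think>", content, flags=re.DOTALL)
reasoning = match.group(1).strip() if match else None

# Remove the reasoning block to obtain the final answer.
final_answer = re.sub(r"<think>.*?</think>", "", content, flags=re.DOTALL).strip()

print("Reasoning:", reasoning)
print("Answer:", final_answer)
```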
#### Roadmap & Feature Requests

Fireworks updates DeepSeek R1 and v3 in alignment with DeepSeek AI’s official releases and Fireworks' own performance optimizations. Updates include bug fixes, efficiency improvements, and potential model refinements. Users can track updates through Fireworks documentation and announcements.

🔗 For the latest version information, refer to the [Fireworks API documentation](https://docs.fireworks.ai) or join the [Fireworks community Discord](https://discord.com/invite/fireworks-ai).

#### General Troubleshooting

If you're encountering an error while using DeepSeek v3 on Fireworks, follow these steps:

✅ **Step 1:** Check Fireworks' [Status Page](https://status.fireworks.ai) for any ongoing outages.

✅ **Step 2:** Verify API request formatting. Ensure:

* No missing/invalid API keys
* Proper request format
* No exceeded rate limits or context window

✅ **Step 3:** Reduce request complexity if your request is too long.

✅ **Step 4:** Adjust parameters if experiencing instability:

* Lower **temperature** for more deterministic responses
* Adjust **top\_p** to control randomness
* Increase **max\_tokens** to avoid truncation

✅ **Step 5:** Contact Fireworks support via:

* 🔗 [Fireworks Support](https://docs.fireworks.ai/support)
* 🔗 [Fireworks Discord](https://discord.com/invite/fireworks-ai) for real-time help.

DeepSeek v3 and R1, like other LLMs, have a **fixed maximum context length of 128K tokens**. If responses are getting cut off:

🔹 **Possible Causes & Solutions:**

1️⃣ Exceeded `max_tokens` setting → 🔧 Increase `max_tokens`
2️⃣ Requesting too much text in a single prompt → 🔧 Break input into smaller chunks
3️⃣ Model context window limit reached → 🔧 Summarize prior messages before appending new ones

💡 **Fix:**

```python
# Assumes an OpenAI-compatible client configured for the Fireworks API.
response = client.chat.completions.create(
    model="accounts/fireworks/models/deepseek-v3",
    messages=[{"role": "user", "content": "Generate a long article summary"}],
    max_tokens=4096,  # Adjust as needed
)
```

📌 **Alternative Fix:** If you need longer responses, re-prompt the model with the last part of the output and ask it to continue.

Intermittent API response issues could be due to:

🔹 **Common Causes & Fixes:**

1️⃣ **High Server Load** – Fireworks may be experiencing peak traffic.\
**Fix:** Retry the request after a few seconds or try during non-peak hours.

2️⃣ **Rate Limits or Spend Limits Reached** – If you've exceeded the API rate limits, requests may temporarily fail.\
**Fix:** Check your rate limits and spend limits in the API dashboard and adjust your usage accordingly.\
🔗 To increase spend limits, add credits: [Fireworks Spend Limits](https://docs.fireworks.ai/guides/quotas_usage/rate-limits#spend-limits)

3️⃣ **Network Connectivity Issues** – The Fireworks API may be unreachable due to network issues.\
**Fix:** Restart your internet connection or use a different network/VPN.

📌 **If problems persist, check Fireworks' [status page](https://status.fireworks.ai) or reach out via our [Discord](https://discord.com/invite/fireworks-ai).** 🚀

***

## 3. Learn about R1 & V3

Stay up to date with the latest advancements and insights into DeepSeek models. Check out our blog, where experts from Fireworks break down everything you need to know about R1 and V3:

* A deep dive into DeepSeek R1's capabilities, architecture, and use cases.
* Explore how DeepSeek v3 now supports vision capabilities through document inlining.
* Learn how reinforcement learning with verifiable rewards is shaping AI training.
* Learn about the distillation process for DeepSeek R1 and how it impacts reasoning capabilities.
* Discover how structured output techniques like reasoning mode improve AI responses.

We've also published videos on our [YouTube channel](https://www.youtube.com/channel/UCHCffBTGYa1Ut72h03ldtGA).