Fireworks hosts many embedding models. Let’s walk through an example of using nomic-ai/nomic-embed-text-v1.5 to see how to query the Fireworks embeddings API.

Embedding documents

An embedding model takes text as input and outputs a vector (a list of floating-point numbers) to use for tasks like similarity comparison and search. Our embedding service is OpenAI-compatible; refer to OpenAI’s embeddings guide and OpenAI’s embeddings documentation for more information on using these models.

Python (OpenAI 1.x)
import openai

# Point the OpenAI client at Fireworks' OpenAI-compatible endpoint.
client = openai.OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="<FIREWORKS_API_KEY>",
)

# Nomic models expect a task prefix; "search_document: " marks text to be indexed.
response = client.embeddings.create(
    model="nomic-ai/nomic-embed-text-v1.5",
    input="search_document: Spiderman was a particularly entertaining movie with...",
)

print(response)

This code embeds the text search_document: Spiderman was a particularly entertaining movie with... and returns a response like the following:

Response
CreateEmbeddingResponse(data=[Embedding(embedding=[0.006380197126418352, 0.011841800063848495,...], index=0, object='embedding')], model='nomic-ai/nomic-embed-text-v1.5', object='list', usage=Usage(prompt_tokens=12, total_tokens=12))
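
The vector itself lives in response.data[0].embedding. A quick way to pull it out and check its size (768 is this model’s default dimension, per Nomic’s model card):

Python (OpenAI 1.x)
# The embedding is a plain list of floats on the first (and only) data item.
vector = response.data[0].embedding
print(len(vector))  # 768 by default; see "Variable dimensions" below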

Embedding queries and documents

In the previous example, you might have noticed the search_document: prefix. Nomic models have been fine-tuned to take task prefixes: user queries need the search_query: prefix, and documents need the search_document: prefix.

Here’s a quick example:

  • Let’s say we previously used the embedding model to embed many movie reviews and stored them in a vector database. All of those documents should have been prefixed with search_document:
  • We now want to build a movie recommender that takes a user query and returns recommendations based on this data. The code below demonstrates how to embed the user query; a similarity-search sketch follows it.
Python (OpenAI 1.x)
import openai

client = openai.OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="<FIREWORKS_API_KEY>",
)

query = "I love superhero movies, any recommendations?"

# User queries get the "search_query: " prefix to match the fine-tuning format.
query_emb = client.embeddings.create(
    model="nomic-ai/nomic-embed-text-v1.5",
    input=f"search_query: {query}",
)
print(query_emb)
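
To turn this into recommendations, compare the query embedding against the stored search_document: embeddings, typically with cosine similarity. Below is a minimal in-memory sketch: the reviews list is invented for illustration, the batched list input is assumed to behave as it does in OpenAI’s API, and a real application would query a vector store instead (see the full guide below).

Python (OpenAI 1.x)
import numpy as np

# Hypothetical reviews standing in for documents already in your vector database.
reviews = [
    "Spiderman was a particularly entertaining movie with...",
    "A quiet drama about a family farm in winter...",
]
doc_embs = client.embeddings.create(
    model="nomic-ai/nomic-embed-text-v1.5",
    input=[f"search_document: {r}" for r in reviews],  # documents keep their prefix
)

def cosine(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query_vec = query_emb.data[0].embedding
scores = [cosine(query_vec, d.embedding) for d in doc_embs.data]
best = max(range(len(reviews)), key=scores.__getitem__)
print(reviews[best])  # the review most similar to the user query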

To see this example end-to-end, including how to use a MongoDB vector store and a Fireworks-hosted generation model for RAG, see our full guide. For more information on the prefixes Nomic models support, please check out this guide from Nomic.

Variable dimensions

The model also supports variable embedding dimensions. To use this, pass a dimensions parameter to the embeddings.create() request:

Python (OpenAI 1.x)
import openai

client = openai.OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="<FIREWORKS_API_KEY>",
)

response = client.embeddings.create(
    model="nomic-ai/nomic-embed-text-v1.5",
    input="search_document: I like Christmas movies, can you make any recommendations?",
    dimensions=128,  # request a 128-dimension embedding
)
print(len(response.data[0].embedding))

You will see that the returned embedding has 128 dimensions.

List of available models

Model name                                     Model size
nomic-ai/nomic-embed-text-v1.5 (recommended)   137M
nomic-ai/nomic-embed-text-v1                   137M
WhereIsAI/UAE-Large-V1                         335M
thenlper/gte-large                             335M
thenlper/gte-base                              109M
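
All of these models are served through the same OpenAI-compatible call; only the model parameter changes. Note that the prefix convention above is specific to the Nomic models; the gte and UAE models are generally used with plain text (an assumption based on their model cards, so check each model’s documentation). For example:

Python (OpenAI 1.x)
# Same client as above, different model. No nomic-style prefix here,
# assuming gte models embed plain text directly.
response = client.embeddings.create(
    model="thenlper/gte-large",
    input="Spiderman was a particularly entertaining movie with...",
)
print(len(response.data[0].embedding))  # 1024 dimensions for gte-large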