Get text embeddings

This document describes how to create a text embedding by using the Vertex AI Text embeddings API.

Vertex AI text embeddings API uses dense vector representations: gemini-embedding-001 , for example, uses 3072-dimensional vectors. Dense vector embedding models use deep-learning methods similar to the ones used by large language models. Unlike sparse vectors, which tend to directly map words to numbers, dense vectors are designed to better represent the meaning of a piece of text. The benefit of using dense vector embeddings in generative AI is that instead of searching for direct word or syntax matches, you can better search for passages that align to the meaning of the query, even if the passages don't use the same language.

The vectors are normalized, so cosine similarity, dot product, and Euclidean distance all yield the same similarity rankings.
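The equivalence above follows from vector algebra: for unit-length vectors, cosine similarity equals the dot product, and squared Euclidean distance is 2 minus twice the dot product. The following pure-Python sketch (with invented toy vectors, not real embeddings) illustrates this:

```python
import math

def cosine(u, v):
    # Cosine similarity: dot(u, v) / (|u| * |v|).
    dot = sum(a * b for a, b in zip(u, v))
    norm = lambda x: math.sqrt(sum(a * a for a in x))
    return dot / (norm(u) * norm(v))

def normalize(v):
    # Scale a vector to unit length.
    n = math.sqrt(sum(a * a for a in v))
    return [a / n for a in v]

# Two toy "embeddings", normalized to unit length.
u = normalize([0.3, 0.4, 0.5])
v = normalize([0.1, 0.9, 0.2])

dot = sum(a * b for a, b in zip(u, v))
euclid = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# For unit vectors: cosine == dot, and squared Euclidean distance == 2 - 2*dot,
# so all three measures order candidate pairs identically.
print(abs(cosine(u, v) - dot) < 1e-12)           # True
print(abs(euclid ** 2 - (2 - 2 * dot)) < 1e-12)  # True
```

Because the three measures are monotonic transformations of one another on unit vectors, you can pick whichever is cheapest for your retrieval system (typically the dot product) without changing the ranking.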

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Enable the Vertex AI API.

    Enable the API

  4. Choose a task type for your embeddings job.

Supported models

You can get text embeddings by using the following models:

| Model name | Description | Output dimensions | Max sequence length | Supported text languages |
| --- | --- | --- | --- | --- |
| gemini-embedding-001 | State-of-the-art performance across English, multilingual, and code tasks. It unifies the previously specialized models text-embedding-005 and text-multilingual-embedding-002 and achieves better performance in their respective domains. Read our Tech Report for more detail. | up to 3072 | 2048 tokens | Supported text languages |
| text-embedding-005 | Specialized in English and code tasks. | up to 768 | 2048 tokens | English |
| text-multilingual-embedding-002 | Specialized in multilingual tasks. | up to 768 | 2048 tokens | Supported text languages |

For the highest embedding quality, use gemini-embedding-001, our largest and best-performing model.

Get text embeddings for a snippet of text

You can get text embeddings for a snippet of text by using the Vertex AI API or the Vertex AI SDK for Python.

API limits

For each request, you're limited to 250 input texts. The API has a maximum input token limit of 20,000. Inputs exceeding this limit result in a 400 error. Each individual input text is further limited to 2048 tokens; any excess is silently truncated. You can also disable silent truncation by setting autoTruncate to false .

For more information, see Text embedding limits .
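A practical consequence of the per-request limit is that larger workloads must be split into batches of at most 250 texts. The following sketch shows one way to do that; `batched` is a hypothetical helper written for illustration, not part of any SDK, and the per-request token budget would still need to be respected separately:

```python
def batched(texts, batch_size=250):
    # Split a list of input texts into request-sized batches,
    # respecting the 250-texts-per-request limit.
    for i in range(0, len(texts), batch_size):
        yield texts[i:i + batch_size]

texts = [f"document {i}" for i in range(600)]
batches = list(batched(texts))
print([len(b) for b in batches])  # [250, 250, 100]
```

Each batch can then be sent as a separate embed_content request.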

Choose an embedding dimension

All models produce a full-length embedding vector by default. For gemini-embedding-001 , this vector has 3072 dimensions, and other models produce 768-dimensional vectors. However, by using the output_dimensionality parameter, users can control the size of the output embedding vector. Selecting a smaller output dimensionality can save storage space and increase computational efficiency for downstream applications, while sacrificing little in terms of quality.
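If you truncate a full-length embedding yourself (or are unsure whether a reduced-dimension vector comes back unit-normalized), re-normalizing client-side is a safe default, since it keeps dot product and cosine similarity interchangeable. This is an illustrative sketch with a made-up vector, not service behavior guaranteed by the API:

```python
import math

def renormalize(values, dim):
    # Keep the first `dim` components, then rescale to unit length so that
    # dot product and cosine similarity remain interchangeable.
    truncated = values[:dim]
    norm = math.sqrt(sum(v * v for v in truncated))
    return [v / norm for v in truncated]

full = [0.6, 0.8, 0.05, -0.025]   # toy "full-length" embedding
small = renormalize(full, 2)
print(small)                       # approximately [0.6, 0.8]
print(sum(v * v for v in small))   # approximately 1.0
```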

The following examples use the gemini-embedding-001 model.

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation .

Set environment variables to use the Gen AI SDK with Vertex AI:

 # Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
 # with appropriate values for your project.
 export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
 export GOOGLE_CLOUD_LOCATION=global
 export GOOGLE_GENAI_USE_VERTEXAI=True

 from google import genai
 from google.genai.types import EmbedContentConfig

 client = genai.Client()
 response = client.models.embed_content(
     model="gemini-embedding-001",
     contents=[
         "How do I get a driver's license/learner's permit?",
         "How long is my driver's license valid for?",
         "Driver's knowledge test study guide",
     ],
     config=EmbedContentConfig(
         task_type="RETRIEVAL_DOCUMENT",  # Optional
         output_dimensionality=3072,  # Optional
         title="Driver's License",  # Optional
     ),
 )
 print(response)
 # Example response:
 # embeddings=[ContentEmbedding(values=[-0.06302902102470398, 0.00928034819662571, 0.014716853387653828, -0.028747491538524628, ...],
 # statistics=ContentEmbeddingStatistics(truncated=False, token_count=13.0))]
 # metadata=EmbedContentMetadata(billable_character_count=112)

Go

Learn how to install or update the Go SDK.

To learn more, see the SDK reference documentation .

Set environment variables to use the Gen AI SDK with Vertex AI:

 # Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
 # with appropriate values for your project.
 export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
 export GOOGLE_CLOUD_LOCATION=global
 export GOOGLE_GENAI_USE_VERTEXAI=True

 import (
 	"context"
 	"fmt"
 	"io"

 	"google.golang.org/genai"
 )

 // generateEmbedContentWithText shows how to embed content with text.
 func generateEmbedContentWithText(w io.Writer) error {
 	ctx := context.Background()
 	client, err := genai.NewClient(ctx, &genai.ClientConfig{
 		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
 	})
 	if err != nil {
 		return fmt.Errorf("failed to create genai client: %w", err)
 	}

 	outputDimensionality := int32(3072)
 	config := &genai.EmbedContentConfig{
 		TaskType:             "RETRIEVAL_DOCUMENT",  // optional
 		Title:                "Driver's License",    // optional
 		OutputDimensionality: &outputDimensionality, // optional
 	}

 	contents := []*genai.Content{
 		{
 			Parts: []*genai.Part{
 				{Text: "How do I get a driver's license/learner's permit?"},
 				{Text: "How long is my driver's license valid for?"},
 				{Text: "Driver's knowledge test study guide"},
 			},
 			Role: "user",
 		},
 	}

 	modelName := "gemini-embedding-001"
 	resp, err := client.Models.EmbedContent(ctx, modelName, contents, config)
 	if err != nil {
 		return fmt.Errorf("failed to generate content: %w", err)
 	}
 	fmt.Fprintln(w, resp)
 	// Example response:
 	// embeddings=[ContentEmbedding(values=[-0.06302902102470398, 0.00928034819662571, 0.014716853387653828, -0.028747491538524628, ...],
 	// statistics=ContentEmbeddingStatistics(truncated=False, token_count=13.0))]
 	// metadata=EmbedContentMetadata(billable_character_count=112)
 	return nil
 }

Add an embedding to a vector database

After you generate embeddings, you can add them to a vector database, such as Vector Search. This enables low-latency retrieval and becomes critical as the size of your data increases.

To learn more about Vector Search, see Overview of Vector Search .
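To build intuition for what a vector database does, the sketch below runs brute-force nearest-neighbor retrieval over a tiny in-memory dictionary; the document IDs and vectors are invented for illustration, and a service like Vector Search replaces this linear scan with approximate nearest-neighbor search at scale:

```python
def top_k(query_vec, index, k=2):
    # Brute-force nearest-neighbor search over a small in-memory "index":
    # score every stored embedding against the query by dot product
    # (equivalent to cosine similarity for unit-normalized vectors).
    scored = [
        (sum(q * d for q, d in zip(query_vec, doc_vec)), doc_id)
        for doc_id, doc_vec in index.items()
    ]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)[:k]]

index = {
    "permit": [0.9, 0.1, 0.0],
    "renewal": [0.1, 0.9, 0.1],
    "study-guide": [0.0, 0.2, 0.9],
}
print(top_k([0.8, 0.2, 0.1], index))  # ['permit', 'renewal']
```

The linear scan is fine for a few thousand vectors but grows with corpus size, which is why managed approximate-nearest-neighbor indexes matter for large datasets.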

What's next
