You can use Vector Search 2.0 to autogenerate embeddings for your Collections. This lets you build new embeddings and deploy them instantly, which streamlines the path from raw data to a live production-scale search engine.
Supported Vertex AI embedding models
Vector Search 2.0 supports the following embedding models:
- Gemini — Provides state-of-the-art performance for embedding text (English-only and multilingual).
- Text — Specializes in text (English-only and multilingual) and source code data.
The following table provides details on each supported model.
| Model | Description | Max output dimensions | Max sequence length (tokens) | Supported modalities and text languages | Additional limits |
|---|---|---|---|---|---|
| `gemini-embedding-001` | State-of-the-art performance across English, multilingual, and code tasks. It unifies the previously specialized models `text-embedding-005` and `text-multilingual-embedding-002` and achieves better performance in their respective domains. | 3072 | 2048 | Supported text languages | Embedding limits |
| `gemini-embedding-2-preview` | A next-generation multimodal embedding model from Google. Built on the latest Gemini model architecture, this "omni embedding model" maps text, image, video, and PDF data into a single, unified embedding space. | 3072 | 8192 | Interleaved text, image, video, and PDF | API limits |
| `text-embedding-004` | Specialized in English and code tasks. | 768 | 2048 | English | API limits |
| `text-embedding-005` | Specialized in English and code tasks. | 768 | 2048 | English | API limits |
| `text-multilingual-embedding-002` | Specialized in multilingual tasks. | 768 | 2048 | Supported text languages | API limits |
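Because each model caps its output dimensionality, it can be useful to validate a requested vector size before creating a Collection. The following is a small client-side sketch: the dictionary simply mirrors the "Max output dimensions" column of the table above and is not part of any API.

```python
# Client-side sanity check against the "Max output dimensions" column above.
# This dict mirrors the table; it is not an API surface.
MAX_OUTPUT_DIMENSIONS = {
    "gemini-embedding-001": 3072,
    "gemini-embedding-2-preview": 3072,
    "text-embedding-004": 768,
    "text-embedding-005": 768,
    "text-multilingual-embedding-002": 768,
}

def check_dimensions(model_id: str, dimensions: int) -> None:
    """Raise if the requested vector size exceeds what the model can emit."""
    max_dims = MAX_OUTPUT_DIMENSIONS[model_id]
    if dimensions > max_dims:
        raise ValueError(
            f"{model_id} supports at most {max_dims} dimensions, got {dimensions}"
        )

check_dimensions("text-embedding-004", 4)  # OK: within the 768-dimension limit
```

A check like this catches configuration mistakes early, before a create-Collection request is rejected by the service.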
Creating Collections with autogenerated embeddings
When creating a Collection, specify the embedding model in the `model_id` field of `vertex_embedding_config`. This model is used whenever a Data Object is created without `genre_embedding` data defined.
The following code demonstrates how to specify the embedding model to use when autogenerating embeddings.
```python
request = vectorsearch.CreateCollectionRequest(
    parent=f"projects/{PROJECT_ID}/locations/{LOCATION}",
    collection_id=collection_id,
    collection={
        "data_schema": {
            "type": "object",
            "properties": {
                "year": {"type": "number"},
                "genre": {"type": "string"},
                "director": {"type": "string"},
                "title": {"type": "string"},
            },
        },
        "vector_schema": {
            "plot_embedding": {"dense_vector": {"dimensions": 3}},
            "soundtrack_embedding": {"dense_vector": {"dimensions": 5}},
            "genre_embedding": {
                "dense_vector": {
                    "dimensions": 4,
                    # If a data object is created without a supplied value for
                    # genre_embedding, it will be auto-generated based on this config.
                    "vertex_embedding_config": {
                        "model_id": "text-embedding-004",
                        "text_template": "Movie: {title} Genre: {genre} Year: {year}",
                        "task_type": "RETRIEVAL_DOCUMENT",
                    },
                }
            },
            "sparse_embedding": {"sparse_vector": {}},
        },
    },
)
operation = vector_search_service_client.create_collection(request=request)
operation.result()
```
In the example code, a new Collection is created with the `model_id` field set to `text-embedding-004`. See Supported Vertex AI embedding models for which embedding models can be specified for `model_id`.
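The `text_template` controls the text that is embedded: field values from the Data Object are substituted into the placeholders. The sketch below illustrates that substitution using plain Python `str.format`; the sample Data Object is invented for illustration, and the service-side templating may differ in detail.

```python
# Hypothetical illustration of text_template substitution; the actual
# server-side templating behavior may differ.
text_template = "Movie: {title} Genre: {genre} Year: {year}"

# An example Data Object matching the data_schema above (invented values).
data_object = {
    "title": "Alien",
    "genre": "Science fiction",
    "year": 1979,
    "director": "Ridley Scott",
}

# Substitute the Data Object's fields into the template to form the text
# that the configured embedding model would receive.
embedding_input = text_template.format(**data_object)
print(embedding_input)  # Movie: Alien Genre: Science fiction Year: 1979
```

Note that fields absent from the template (such as `director` here) are simply ignored during substitution.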
Quotas
Autogenerated embeddings rely on customer quotas for the underlying Vertex AI embedding models. Usage is primarily constrained by two quotas:

- Embed content input tokens per minute per region per base_model.
- Online prediction requests per minute per region per base_model.
Make sure you have enough quota before creating Data Objects or running an import job.
See Manage your quota using the console for information on how to request larger quotas.
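Before a large import, a back-of-the-envelope estimate of token consumption can tell you whether your tokens-per-minute quota is likely to be a bottleneck. The sketch below is only a rough heuristic: the 4-characters-per-token ratio is a common rule of thumb for English text, not the model's actual tokenizer.

```python
def estimate_tokens_per_minute(
    num_objects: int,
    avg_template_chars: int,
    import_minutes: int,
    chars_per_token: int = 4,
) -> float:
    """Rough estimate of embed-content input tokens consumed per minute.

    chars_per_token=4 is a common rule of thumb for English text,
    not the embedding model's actual tokenizer.
    """
    total_tokens = num_objects * avg_template_chars / chars_per_token
    return total_tokens / import_minutes

# Importing 1,000,000 Data Objects whose rendered text_template averages
# 200 characters, spread over a 60-minute import window:
print(estimate_tokens_per_minute(1_000_000, 200, 60))  # ~833,333 tokens/min
```

If the estimate approaches your quota, request an increase or spread the import over a longer window before starting.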

