This page explains how you can ground responses by using your data from Vertex AI Search.
Grounding Gemini to your data
If you want to perform retrieval-augmented generation (RAG), connect your model to your website data or document sets by using Grounding with Vertex AI Search.
Grounding to your data supports a maximum of 10 Vertex AI Search data sources and can be combined with Grounding with Google Search.
Supported models
This section lists the models that support grounding with your data.
- Gemini 3 Pro (Preview)
- Gemini 3 Pro Image (Preview)
- Gemini 2.5 Pro
- Gemini 2.5 Flash (Preview)
- Gemini 2.5 Flash-Lite (Preview)
- Gemini 2.5 Flash
- Gemini 2.5 Flash-Lite
- Gemini 2.5 Flash with Gemini Live API native audio
- Gemini 2.5 Flash with Live API native audio (Preview)
- Gemini 2.0 Flash with Live API (Preview)
- Gemini 2.0 Flash
Prerequisites
Before you can ground model output to your data, do the following:

1. Verify that you have the required IAM permissions. The grounding service requires the discoveryengine.servingConfigs.search permission, which you can check on the IAM page in the Google Cloud console. To get the permissions that you need to use grounding with Vertex AI Search, ask your administrator to grant you the following IAM roles:

   - To read all Discovery Engine resources: Discovery Engine Viewer (roles/discoveryengine.viewer).
   - To read and write all Discovery Engine resources and to create a Vertex AI Search instance: Discovery Engine Editor (roles/discoveryengine.editor).

   For more information about IAM, see IAM roles and permissions.

2. Enable AI Applications and activate the API.

3. Create an AI Applications data source and application.

For more information, see the Introduction to Vertex AI Search.
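If you prefer the CLI, the API behind AI Applications (the Discovery Engine API) can typically be activated with gcloud; the project ID below is a placeholder for your own.

```shell
# Enable the Discovery Engine API, which backs AI Applications /
# Vertex AI Search, in your project.
gcloud services enable discoveryengine.googleapis.com --project=my-project
```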
Enable AI Applications
To use Vertex AI Search to ground your responses, you must activate the Vertex AI Search service by following these steps:

1. In the Google Cloud console, go to the AI Applications page.

2. Optional: Review the terms for data use.

AI Applications is available in the global location or the eu and us multi-regions. To learn more, see AI Applications locations.
Create a data store in AI Applications
To create a data store in AI Applications, choose whether to ground with website data or with documents.
Website
1. Open the Create Data Store page from the Google Cloud console.

2. In the Website Content box, click Select. The Specify the websites for your data store pane displays.

3. If the Advanced website indexing checkbox isn't selected, select it to turn on advanced website indexing.

4. In the Specify URL patterns to index section, do the following:

   - Add URLs for Sites to include.
   - Optional: Add URLs for Sites to exclude.

5. Click Continue. The Configure your data store pane displays.

6. In the Configure your data store pane, do the following:

   - Select a value from the Location of your data store list.
   - Enter a name in the Your data store name field. The ID is generated. Use this ID when you generate your grounded responses with your data store. For more information, see Generate grounded responses with your data store.
   - Click Create.
Documents
1. Open the Create Data Store page from the Google Cloud console.

2. In the Cloud Storage box, click Select. The Import data from Cloud Storage pane displays.

3. In the Unstructured documents (PDF, HTML, TXT and more) section, select Unstructured documents (PDF, HTML, TXT and more).

4. Select a Synchronization frequency option.

5. Select a Select a folder or a file you want to import option, and enter the path in the field.

6. Click Continue. The Configure your data store pane displays.

7. In the Configure your data store pane, do the following:

   - Select a value from the Location of your data store list.
   - Enter a name in the Your data store name field. The ID is generated.
   - To select parsing and chunking options for your documents, expand the Document Processing Options section. For more information about different parsers, see Parse documents.

8. Click Create.
Generate grounded responses with your data store
Use the following instructions to ground a model with your data. A maximum of 10 data stores is supported.
If you don't know your data store ID, follow these steps:

1. In the Google Cloud console, go to the AI Applications page, and in the navigation menu, click Data stores.

2. Click the name of your data store.

3. On the Data page for your data store, get the data store ID.
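Both the SDK and REST examples later on this page expect the data store ID combined with your project ID into a full resource name. A quick sketch of that format (the project and data store IDs shown are placeholders):

```python
def datastore_path(project_id: str, datastore_id: str) -> str:
    """Build the full Vertex AI Search data store resource name.

    Data stores created in the global location live under the
    default_collection, which is the common case.
    """
    return (
        f"projects/{project_id}"
        f"/locations/global/collections/default_collection"
        f"/dataStores/{datastore_id}"
    )

print(datastore_path("my-project", "my-datastore_1234567890"))
```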
Console
To ground your model output to AI Applications by using Vertex AI Studio in the Google Cloud console, follow these steps:

1. In the Google Cloud console, go to the Vertex AI Studio page.
2. To turn on grounding, follow these steps:
   1. From the navigation menu, click + New, and then click Chat.
   2. Expand the Model settings pane, and select your model.
   3. Optional: If the Structured output or Grounding: Google toggle is on, turn it off.
   4. Click the Grounding: Your data toggle. The Customize Grounding pane appears.
   5. Select a grounding source option from the following table:

      | Grounding option | Description | Input |
      | --- | --- | --- |
      | Vertex AI RAG Engine | Grounds using your data and do-it-yourself components. | If you don't have a corpus, you must create one. Otherwise, enter your corpus. |
      | Vertex AI Search | Grounds using your data with a Google-managed search engine. | Enter your path into the Vertex AI datastore path field. |
      | Elasticsearch | Grounds using Elasticsearch. | Enter values into the Elasticsearch endpoint, Elasticsearch API Key, Elasticsearch index, and Elasticsearch search template fields. |

   6. Click Save.
3. Enter your prompt in the text box, and click Submit. Your prompt responses are grounded in AI Applications.
Python
Install the Google Gen AI SDK:

```shell
pip install --upgrade google-genai
```

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

```shell
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
```
```python
from google import genai
from google.genai.types import (
    GenerateContentConfig,
    HttpOptions,
    Retrieval,
    Tool,
    VertexAISearch,
)

client = genai.Client(http_options=HttpOptions(api_version="v1"))

# Replace with your Vertex AI Search data store details
DATASTORE_PATH = (
    "projects/PROJECT_ID/locations/global/collections/"
    "default_collection/dataStores/DATASTORE_ID"
)

tool = Tool(
    retrieval=Retrieval(
        vertex_ai_search=VertexAISearch(datastore=DATASTORE_PATH)
    )
)

response = client.models.generate_content(
    model="gemini-2.5-flash",  # Or another supported model
    contents="What information can you find about topic X in the provided documents?",  # Your query
    config=GenerateContentConfig(tools=[tool]),
)
print(response.text)
```
REST
To test a text prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.
Before using any of the request data, make the following replacements:
- LOCATION: The region to process the request. To use the global endpoint, exclude the location from the endpoint name, and configure the location of the resource to global.
- PROJECT_ID: Your project ID.
- MODEL_ID: The model ID of the multimodal model.
- PROMPT: The prompt to send to the model.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent
Request JSON body:
```json
{
  "contents": [{
    "role": "user",
    "parts": [{
      "text": "PROMPT"
    }]
  }],
  "tools": [{
    "retrieval": {
      "vertexAiSearch": {
        "datastore": "projects/PROJECT_ID/locations/global/collections/default_collection/dataStores/DATASTORE_ID"
      }
    }
  }],
  "model": "projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID"
}
```
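If you script the call rather than pasting JSON into curl, the request body can be assembled programmatically. The sketch below mirrors the body shown above; the prompt, project, data store, and model IDs are placeholders.

```python
import json


def build_grounded_request(prompt: str, project_id: str,
                           datastore_id: str, model_id: str,
                           location: str = "global") -> dict:
    """Assemble a generateContent request body that grounds the prompt
    in a Vertex AI Search data store."""
    datastore = (
        f"projects/{project_id}/locations/global/collections/"
        f"default_collection/dataStores/{datastore_id}"
    )
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "tools": [{"retrieval": {"vertexAiSearch": {"datastore": datastore}}}],
        "model": f"projects/{project_id}/locations/{location}"
                 f"/publishers/google/models/{model_id}",
    }


body = build_grounded_request("How do I renew my license?",
                              "my-project", "my-datastore",
                              "gemini-2.5-flash")
print(json.dumps(body, indent=2))
```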
You should receive a JSON response similar to the following:
{
"candidates": [
{
"content": {
"role": "model",
"parts": [
{
"text": "You can make an appointment on the website https://dmv.gov/"
}
]
},
"finishReason": "STOP",
"safetyRatings": [
"..."
],
"groundingMetadata": {
"retrievalQueries": [
"How to make appointment to renew driving license?"
],
"groundingChunks": [
{
"retrievedContext": {
"uri": "https://vertexaisearch.cloud.google.com/grounding-api-redirect/AXiHM.....QTN92V5ePQ==",
"title": "dmv"
}
}
],
"groundingSupport": [
{
"segment": {
"startIndex": 25,
"endIndex": 147
},
"segment_text": "ipsum lorem ...",
"supportChunkIndices": [1, 2],
"confidenceScore": [0.9541752, 0.97726375]
},
{
"segment": {
"startIndex": 294,
"endIndex": 439
},
"segment_text": "ipsum lorem ...",
"supportChunkIndices": [1],
"confidenceScore": [0.9541752, 0.9325467]
}
]
}
}
],
"usageMetadata": {
"..."
}
}
Understand your response
The response from both APIs includes the LLM-generated text, which is called a candidate. If your model prompt successfully grounds to your data source, then the response includes grounding metadata, which identifies the parts of the response that were derived from your data. However, this metadata might not be provided, in which case the prompt response isn't grounded. Reasons include low source relevance or incomplete information within the model's response.
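Because the metadata can be absent, it's worth checking for it before rendering citations. A minimal guard over the REST-style JSON shape shown above (an illustrative helper, not part of the API):

```python
def is_grounded(candidate: dict) -> bool:
    """Return True if the candidate carries grounding metadata with at
    least one retrieved chunk, i.e. the answer actually cites your data."""
    metadata = candidate.get("groundingMetadata") or {}
    return bool(metadata.get("groundingChunks"))


# A candidate without groundingMetadata is treated as ungrounded.
print(is_grounded({"content": {"parts": [{"text": "hi"}]}}))
```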
The following is a breakdown of the output data:
- Role: Indicates the sender of the grounded answer. Because the response always contains grounded text, the role is always model.
- Text: The grounded answer generated by the LLM.
- Grounding metadata: Information about the grounding source, which contains the following elements:
  - Grounding chunks: A list of results from your index that support the answer.
  - Grounding supports: Information about a specific claim within the answer that can be used to show citations:
    - Segment: The part of the model's answer that is substantiated by a grounding chunk.
    - Grounding chunk index: The index of the grounding chunks in the grounding chunks list that corresponds to this claim.
    - Confidence scores: A number from 0 to 1 that indicates how grounded the claim is in the provided set of grounding chunks. Not available for Gemini 2.5 and later.
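The grounding support entries can be turned into inline citations by pairing each segment with the chunks that substantiate it. A minimal sketch over the REST-style JSON shape, using field names from the sample response above (this is illustrative, not an official helper):

```python
def citations(grounding_metadata: dict) -> list[str]:
    """For each supported segment, list the titles of the grounding
    chunks that substantiate it."""
    chunks = grounding_metadata.get("groundingChunks", [])
    lines = []
    for support in grounding_metadata.get("groundingSupport", []):
        titles = [
            chunks[i]["retrievedContext"]["title"]
            for i in support.get("supportChunkIndices", [])
            if i < len(chunks)  # skip indices outside the chunk list
        ]
        seg = support.get("segment", {})
        lines.append(
            f"chars {seg.get('startIndex')}-{seg.get('endIndex')}: "
            + ", ".join(titles)
        )
    return lines


metadata = {
    "groundingChunks": [
        {"retrievedContext": {"uri": "https://example.com", "title": "dmv"}}
    ],
    "groundingSupport": [
        {"segment": {"startIndex": 0, "endIndex": 10},
         "supportChunkIndices": [0]}
    ],
}
print(citations(metadata))  # ['chars 0-10: dmv']
```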
What's next
- To learn how to send chat prompt requests, see Multiturn chat .
- To learn about responsible AI best practices and Vertex AI's safety filters, see Safety best practices .

