Skip to main content
Vertex AI documentation is no longer being updated
Vertex AI's services are now part of Gemini Enterprise Agent Platform. See the most up-to-date information in the Agent Platform documentation
.
Send feedback
RAG Engine API Stay organized with collections
Save and categorize content based on your preferences.
The Vertex AI RAG Engine is a component of the
Vertex AI platform, which facilitates Retrieval-Augmented
Generation (RAG). RAG Engine enables Large Language Models (LLMs) to access
and incorporate data from external knowledge sources, such as documents and
databases. By using RAG, LLMs can generate more accurate and informative LLM
responses.
Parameters list
This section lists the following:
Corpus management parameters
For information about a RAG corpus, see Corpus management
.
Create a RAG corpus
This table lists the parameters used to create a RAG corpus.
Body Request
Required: string
The display name of the RAG corpus.
Optional: string
The description of the RAG corpus.
Optional: Immutable: string
The CMEK key name is used to encrypt at-rest data that's related to the RAG corpus. The key name is only applicable to the RagManaged
option for the vector database. When the corpus is created, this field can be set and can't be updated or deleted.
Format: projects/{project}/locations/{location}/keyRings/{key_ring}/cryptoKeys/{key_name}
Optional: Immutable: vectorDbConfig
The configuration for the Vector DBs.
vertex_ai_search_config.serving_config
Optional: string
The configuration for the Agent Search.
Format: projects/{project}/locations/{location}/collections/{collection}/engines/{engine}/servingConfigs/{serving_config}
or projects/{project}/locations/{location}/collections/{collection}/dataStores/{data_store}/servingConfigs/{serving_config}
vectorDbConfig
oneof
vector_db
: vectorDbConfig.RagManagedDb
If no vector database is specified, rag_managed_db
is the default vector database.
oneof
vector_db
: vectorDbConfig.Pinecone
Specifies your Pinecone instance.
string
This is the name used to create the Pinecone index that's used with the RAG corpus.
This value can't be changed after it's set. You can leave it empty in
the CreateRagCorpus
API call, and set it with a non-empty
value in a follow up UpdateRagCorpus
API call.
oneof
vector_db
: vectorDbConfig.VertexVectorSearch
Specifies your Vertex Vector Search instance.
vertex_vector_search.index
string
This is the resource name of the Vector Search index that's used with the RAG corpus.
Format: projects/{project}/locations/{location}/indexEndpoints/{index_endpoint}
This value can't be changed after it's set. You can leave it empty in
the CreateRagCorpus
API call, and set it with a non-empty
value in a follow up UpdateRagCorpus
API call.
vertex_vector_search.index_endpoint
string
This is the resource name of the Vector Search index endpoint that's used with the RAG corpus.
Format: projects/{project}/locations/{location}/indexes/{index}
This value can't be changed after it's set. You can leave it empty in
the CreateRagCorpus
API call, and set it with a non-empty
value in a follow up UpdateRagCorpus
API call.
api_auth.api_key_config.api_key_secret_version
string
This the full resource name of the secret that is stored in Secret Manager,
which contains your Pinecone API key.
Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}
You can leave it empty in the CreateRagCorpus
API call, and set it with a non-empty
value in a follow up UpdateRagCorpus
API call.
rag_embedding_model_config.vertex_prediction_endpoint.endpoint
Optional: Immutable: string
The embedding model to use for the RAG corpus. This value can't be
changed after it's set. If you leave it empty, we use text-embedding-005
as the embedding model.
Update a RAG corpus
This table lists the parameters used to update a RAG corpus.
Body Request
Optional: string
The display name of the RAG corpus.
Optional: string
The description of the RAG corpus.
rag_vector_db.pinecone.index_name
string
This is the name used to create the Pinecone index that's used with the RAG corpus.
If your RagCorpus
was created with a Pinecone
configuration, and this field has never been set before, then you can update
the Pinecone instance's index name.
rag_vector_db.vertex_vector_search.index
string
This is the resource name of the Vector Search index that's used with the RAG corpus.
Format: projects/{project}/locations/{location}/indexEndpoints/{index_endpoint}
If your RagCorpus
was created with a Vector Search
configuration, and this field has never been set before, then you can update it.
rag_vector_db.vertex_vector_search.index_endpoint
string
This is the resource name of the Vector Search index endpoint that's used with the RAG corpus.
Format: projects/{project}/locations/{location}/indexes/{index}
If your RagCorpus
was created with a Vector Search
configuration, and this field has never been set before, then you can update it.
rag_vector_db.api_auth.api_key_config.api_key_secret_version
string
The full resource name of the secret that is stored in Secret Manager,
which contains your Pinecone API key.
Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}
List RAG corpora
This table lists the parameters used to list RAG corpora.
Optional: int
The standard list page size.
Optional: string
The standard list page token. Typically obtained from [ListRagCorporaResponse.next_page_token][]
of the previous [VertexRagDataService.ListRagCorpora][]
call.
Get a RAG corpus
This table lists parameters used to get a RAG corpus.
string
The name of the RagCorpus
resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}
Delete a RAG corpus
This table lists parameters used to delete a RAG corpus.
string
The name of the RagCorpus
resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}
File management parameters
For information about a RAG file, see File management
.
Upload a RAG file
This table lists parameters used to upload a RAG file.
Body Request
string
The name of the RagCorpus
resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}
Required: RagFile
The file to upload.
Required: UploadRagFileConfig
The configuration for the RagFile
to be uploaded into the RagCorpus
.
Required: string
The display name of the RAG file.
Optional: string
The description of the RAG file.
rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_size
int32
Number of tokens each chunk has.
rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_overlap
int32
The overlap between chunks.
Import RAG files
This table lists parameters used to import a RAG file.
Required: string
The name of the RagCorpus
resource.
Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}
oneof
import_source
: GcsSource
Cloud Storage location.
Supports importing individual files as well as entire Cloud Storage directories.
list
of string
Cloud Storage URI that contains the upload file.
oneof
import_source
: GoogleDriveSource
Google Drive location.
Supports importing individual files as well as Google Drive folders.
oneof
import_source
: SlackSource
The slack channel where the file is uploaded.
oneof
import_source
: JiraSource
The Jira query where the file is uploaded.
oneof
import_source
: SharePointSources
The SharePoint sources where the file is uploaded.
rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_size
int32
Number of tokens each chunk has.
rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_overlap
int32
The overlap between chunks.
Optional: RagFileParsingConfig
Specifies the parsing configuration for RagFiles
.
If this field isn't set, RAG uses the default parser.
max_embedding_requests_per_min
Optional: int32
The maximum number of queries per minute that this job is allowed to make to the embedding model specified on the corpus. This value is specific to this job and not shared across other import jobs. Consult the Quotas page on the project to set an appropriate value.
If unspecified, a default value of 1,000 QPM is used.
Required: string
The ID of the Google Drive resource.
resource_ids.resource_type
Required: string
The type of the Google Drive resource.
Repeated: SlackSource.SlackChannels.SlackChannel
Slack channel information, include ID and time range to import.
channels.channels.channel_id
Required: string
The Slack channel ID.
channels.channels.start_time
Optional: google.protobuf.Timestamp
The starting timestamp for messages to import.
channels.channels.end_time
Optional: google.protobuf.Timestamp
The ending timestamp for messages to import.
channels.api_key_config.api_key_secret_version
Required: string
The full resource name of the secret that is stored in Secret Manager,
which contains a Slack channel access token that has access to the slack channel IDs. See: https://api.slack.com/tutorials/tracks/getting-a-token.
Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}
Repeated: string
A list of Jira projects to import in their entirety.
jira_queries.custom_queries
Repeated: string
A list of custom Jira queries to import. For information about JQL (Jira Query Language), see Jira Support
Required: string
The Jira email address.
Required: string
The Jira server URI.
jira_queries.api_key_config.api_key_secret_version
Required: string
The full resource name of the secret that is stored in Secret Manager,
which contains Jira API key that has access to the slack channel IDs. See: https://support.atlassian.com/atlassian-account/docs/manage-api-tokens-for-your-atlassian-account/
Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}
share_point_sources.sharepoint_folder_path
oneof
in folder_source
: string
The path of the SharePoint folder to download from.
share_point_sources.sharepoint_folder_id
oneof
in folder_source
: string
The ID of the SharePoint folder to download from.
share_point_sources.drive_name
oneof
in drive_source
: string
The name of the drive to download from.
share_point_sources.drive_id
oneof
in drive_source
: string
The ID of the drive to download from.
share_point_sources.client_id
string
The Application ID for the app registered in Microsoft Azure Portal. The application must also be configured with MS Graph permissions
"Files.ReadAll", "Sites.ReadAll" and BrowserSiteLists.Read.All.
share_point_sources.client_secret.api_key_secret_version
Required: string
The full resource name of the secret that is stored in Secret Manager,
which contains the application secret for the app registered in Azure.
Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}
share_point_sources.tenant_id
string
Unique identifier of the Azure Active Directory Instance.
share_point_sources.sharepoint_site_name
string
The name of the SharePoint site to download from. This can be the site name or the site id.
oneof
parser
: RagFileParsingConfig.LayoutParser
The Layout Parser to use for RagFile
s.
layout_parser.processor_name
string
The full resource name of a Document AI processor or processor version.
Format: projects/{project_id}/locations/{location}/processors/{processor_id}
projects/{project_id}/locations/{location}/processors/{processor_id}/processorVersions/{processor_version_id}
layout_parser.max_parsing_requests_per_min
string
The maximum number of requests the job is allowed to make to the Document AI processor per minute.
Consult https://cloud.google.com/document-ai/quotas and the Quota page
for your project to set an appropriate value here. If unspecified, a default
value of 120 QPM is used.
oneof
parser
: RagFileParsingConfig.LlmParser
The LLM parser to use for RagFile
s.
string
The resource name of an LLM model.
Format: {publisher}/models/{model}
llm_parser.max_parsing_requests_per_min
string
The maximum number of requests the job is allowed to make to the LLM model per minute.
To set an appropriate value for your project, see model quota section
and the Quota page
for your project to set an appropriate value here. If unspecified, a default
value of 5000 QPM is used.
Get a RAG file
This table lists parameters used to get a RAG file.
string
The name of the RagFile
resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_file_id}
Delete a RAG file
This table lists parameters used to delete a RAG file.
string
The name of the RagFile
resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_file_id}
Retrieval and prediction parameters
This section lists the retrieval and prediction parameters.
Retrieval parameters
This table lists parameters for retrieveContexts
API.
Required: string
The resource name of the Location to retrieve RagContexts
. The users must have permission to make a call in the project.
Format: projects/{project}/locations/{location}
VertexRagStore
The data source for Vertex RagStore.
Required: RagQuery
Single RAG retrieve query.
VertexRagStore
list: RagResource
The representation of the RAG source. It can be used to specify the corpus
only or RagFile
s. Only support one corpus or multiple files
from one corpus.
Optional: string
RagCorpora
resource name.
Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}
rag_resources.rag_file_ids
list: string
A list of RagFile
resources.
Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file}
string
The query in text format to get relevant contexts.
Optional: RagRetrievalConfig
The retrieval configuration for the query.
Optional: int32
The number of contexts to retrieve.
filter.vector_distance_threshold
oneof vector_db_threshold
: double
Only returns contexts with a vector distance smaller than the threshold.
filter.vector_similarity_threshold
oneof vector_db_threshold
: double
Only returns contexts with vector similarity larger than the threshold.
ranking.rank_service.model_name
Optional: string
The model name of the rank service.
Example: semantic-ranker-512@latest
ranking.llm_ranker.model_name
Optional: string
The model name used for ranking.
Example: gemini-2.5-flash
Prediction parameters
This table lists prediction parameters.
tools.retrieval.vertex_rag_store
VertexRagStore
Set to use a data source powered by Vertex AI RAG store.
See VertexRagStore
for details.
Project management parameters
This table lists project-level parameters.
RagEngineConfig
Corpus management examples
This section provides examples of how to use the API to manage your RAG corpus.
Create a RAG corpus example
These code samples demonstrate how to create a RAG corpus.
REST
Before using any of the request data, make the following replacements:
PROJECT_ID
: Your project ID.
LOCATION
: The region to process the request.
CORPUS_DISPLAY_NAME
: The display name of the RAG corpus.
CORPUS_DESCRIPTION
: The description of the RAG corpus.
HTTP method and URL:
POST
https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora
Request JSON body:
{
"display_name"
:
" CORPUS_DISPLAY_NAME
"
,
"description"
:
" CORPUS_DESCRIPTION
"
,
}
To send your request, choose one of these options:
curl
Note: The following command assumes that you have signed in to the
Google Cloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
, or
by using Cloud Shell, which automatically signs you into the
gcloud CLI CLI . You can check the active account by running
gcloud CLI auth list
.
Save the request body in a file named request.json, and run the
following command:
curl
-X
POST
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
-H
"Content-Type: application/json; charset=utf-8"
\
-d
@request.json
\
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora"
Powershell
Note: The following command assumes that you have signed in to the
gcloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
. You
can check the active account by running gcloud CLI auth
list
.
Save the request body in a file named request.json, and run the
following command:
$cred
=
gcloud
auth
print-access-token
$headers
=
@ {
"Authorization"
=
"Bearer
$cred
"
}
Invoke-WebRequest
`
-Method
POST
`
-Headers
$headers
`
-ContentType:
"application/json; charset=utf-8"
`
-InFile
request.json
`
-Uri
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora"
|
Select-Object
-Expand
Content
You should receive a successful status code (2xx).
The following example demonstrates how to create a RAG corpus by using the REST
API.
//
CreateRagCorpus
//
Input:
LOCATION,
PROJECT_ID,
CORPUS_DISPLAY_NAME
//
Output:
CreateRagCorpusOperationMetadata
curl
-X
POST
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
-H
"Content-Type: application/json"
\
https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora
\
-d
'{
"display_name" : " CORPUS_DISPLAY_NAME
"
}'
Update a RAG corpus example
You can update your RAG corpus with a new display name, description, and vector
database configuration. However, you can't change the following parameters
in your RAG corpus:
The vector database type. For example, you can't change the vector database
from Weaviate to Vertex AI Feature Store.
If you're using the managed database option, you can't update the vector
database configuration.
These examples demonstrate how to update a RAG corpus.
REST
Before using any of the request data, make the following replacements:
PROJECT_ID
: Your project ID.
LOCATION
: The region to process the request.
CORPUS_ID
: The corpus ID of your RAG corpus.
CORPUS_DISPLAY_NAME
: The display name of the RAG corpus.
CORPUS_DESCRIPTION
: The description of the RAG corpus.
INDEX_NAME
: The resource name of the
Vector Search Index. Format: projects/{project}/locations/{location}/indexes/{index}
.
INDEX_ENDPOINT_NAME
: The resource name of the
Vector Search index endpoint. Format: projects/{project}/locations/{location}/indexEndpoints/{index_endpoint}
.
HTTP method and URL:
PATCH
https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ CORPUS_ID
Request JSON body:
{
"display_name"
:
" CORPUS_DISPLAY_NAME
"
,
"description"
:
" CORPUS_DESCRIPTION
"
,
"vector_db_config"
:
{
"vertex_vector_search"
:
{
"index"
:
" INDEX_NAME
"
,
"index_endpoint"
:
" INDEX_ENDPOINT_NAME
"
,
}
}
}
To send your request, choose one of these options:
curl
Note: The following command assumes that you have signed in to the
Google Cloud CLI CLI with your user account by running gcloud CLI init
or gcloud CLI auth login
, or by using Cloud Shell,
which automatically signs you into the gcloud CLI CLI . You can
check the active account by running gcloud CLI auth list
.
Save the request body in a file named request.json, and run the
following command:
curl
-X
PATCH
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
-H
"Content-Type: application/json; charset=utf-8"
\
-d
@request.json
\
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ CORPUS_ID
"
Powershell
Note: The following command assumes that you have signed in to the
gcloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
. You
can check the active account by running gcloud CLI auth list
.
Save the request body in a file named request.json, and run the following command:
$cred
=
gcloud
auth
print-access-token
$headers
=
@{
"Authorization"
=
"Bearer $cred"
}
Invoke-WebRequest
`
-Method
PATCH
`
-Headers
$headers
`
-ContentType
:
"application/json; charset=utf-8"
`
-InFile
request
.
json
`
-Uri
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ CORPUS_ID
"
|
Select-Object
-Expand
Content
You should receive a successful status code (2xx).
List RAG corpora example
These code samples demonstrate how to list all of the RAG corpora.
REST
Before using any of the request data, make the following replacements:
PROJECT_ID
: Your project ID.
LOCATION
: The region to process the request.
PAGE_SIZE
: The standard list page size. You might adjust
the number of RAG corpora to return per page by updating the page_size
parameter.
PAGE_TOKEN
: The standard list page token. Obtained
typically using ListRagCorporaResponse.next_page_token
of the previous VertexRagDataService.ListRagCorpora
call.
HTTP method and URL:
GET
https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora?page_size =
PAGE_SIZE
&page_token =
PAGE_TOKEN
To send your request, choose one of these options:
curl
Note: The following command assumes that you have signed in to the
Google Cloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
, or
by using Cloud Shell, which automatically signs you into the
gcloud CLI CLI . You can check the active account by running
gcloud CLI auth list
.
Run the following command:
curl
-X
GET
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora?page_size= PAGE_SIZE
&page_token= PAGE_TOKEN
"
Powershell
Note: The following command assumes that you have signed in to the
gcloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
. You
can check the active account by running gcloud CLI auth
list
.
Run the following command:
$cred
=
gcloud
auth
print-access-token $headers
=
@ {
"Authorization"
=
"Bearer
$cred
"
}
Invoke-WebRequest
`
-Method
GET
`
-Headers
$headers
`
-Uri
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora?page_size= PAGE_SIZE
&page_token= PAGE_TOKEN
"
|
Select-Object
-Expand
Content
You should receive a successful status code ( 2xx
) and a list of RAG
corpora under the given PROJECT_ID
.
Get a RAG corpus example
These code samples demonstrate how to get a RAG corpus.
REST
Before using any of the request data, make the following replacements:
PROJECT_ID
: Your project ID.
LOCATION
: The region to process the request.
RAG_CORPUS_ID
: The ID of the RAG corpus resource.
HTTP method and URL:
GET
https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
To send your request, choose one of these options:
curl
Note: The following command assumes that you have signed in to the
Google Cloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
, or
by using Cloud Shell, which automatically signs you into the
gcloud CLI CLI . You can check the active account by running
gcloud CLI auth list
.
Run the following command:
curl
-X
GET
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
"
Powershell
Note: The following command assumes that you have signed in to the
gcloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
. You
can check the active account by running gcloud CLI auth
list
.
Run the following command:
$cred
=
gcloud
auth
print-access-token $headers
=
@ {
"Authorization"
=
"Bearer
$cred
"
}
Invoke-WebRequest
`
-Method
GET
`
-Headers
$headers
`
-Uri
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
"
|
Select-Object
-Expand
Content
A successful response returns the RagCorpus
resource.
The get
and list
commands are used in an example to demonstrate how RagCorpus
uses the rag_embedding_model_config
field with in the vector_db_config
, which points to the embedding model you have chosen.
PROJECT_ID
:
Your
project
ID.
LOCATION
:
The
region
to
process
the
request.
RAG_CORPUS_ID
:
The
corpus
ID
of
your
RAG
corpus.
```
```
sh
//
GetRagCorpus
//
Input:
LOCATION,
PROJECT_ID,
RAG_CORPUS_ID
//
Output:
RagCorpus
curl
-X
GET
\
-H
"Content-Type: application/json"
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
//
ListRagCorpora
curl
-sS
-X
GET
\
-H
"Content-Type: application/json"
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/
```
Delete a RAG corpus example
These code samples demonstrate how to delete a RAG corpus.
REST
Before using any of the request data, make the following replacements:
PROJECT_ID
: Your project ID.
LOCATION
: The region to process the request.
RAG_CORPUS_ID
: The ID of the RagCorpus
resource.
HTTP method and URL:
DELETE
https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
To send your request, choose one of these options:
curl
Note: The following command assumes that you have signed in to the
Google Cloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
, or
by using Cloud Shell, which automatically signs you into the
gcloud CLI CLI . You can check the active account by running
gcloud CLI auth list
.
Run the following command:
curl
-X
DELETE
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
"
Powershell
Note: The following command assumes that you have signed in to the
gcloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
. You
can check the active account by running gcloud CLI auth
list
.
Run the following command:
$cred
=
gcloud
auth
print-access-token $headers
=
@ {
"Authorization"
=
"Bearer
$cred
"
}
Invoke-WebRequest
`
-Method
DELETE
`
-Headers
$headers
`
-Uri
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
"
|
Select-Object
-Expand
Content
A successful response returns the DeleteOperationMetadata
.
File management examples
This section provides examples of how to use the API to manage RAG files.
Upload a RAG file example
These code samples demonstrate how to upload a RAG file.
REST
Before using any of the request data, make the following replacements:
PROJECT_ID
: Your project ID.
LOCATION
: The region to process the request.
RAG_CORPUS_ID
: The corpus ID of your RAG corpus.
LOCAL_FILE_PATH
: The local path to the file to be
uploaded.
DISPLAY_NAME
: The display name of the RAG file.
DESCRIPTION
: The description of the RAG file.
To send your request, use the following command:
curl
-X
POST
\
-H
"X-Goog-Upload-Protocol: multipart"
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
-F
metadata
=
"{'rag_file': {'display_name':' DISPLAY_NAME
', 'description':' DESCRIPTION
'}}"
\
-F
file
=
@ LOCAL_FILE_PATH
\
"https:// LOCATION
-aiplatform.googleapis.com/upload/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
/ragFiles:upload"
Import RAG files example
Files and folders can be imported from Drive or
Cloud Storage. You can use response.metadata
to view partial
failures, request time, and response time in the SDK's response
object.
The response.skipped_rag_files_count
refers to the number of files that
were skipped during import. A file is skipped when the following conditions are
met:
The file has already been imported.
The file hasn't changed.
The chunking configuration for the file hasn't changed.
Python
from
vertexai
import
rag
import
vertexai
# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"
# paths = ["https://drive.google.com/file/123", "gs://my_bucket/my_files_dir"] # Supports Cloud Storage and Google Drive Links
# Initialize Vertex AI API once per session
vertexai
.
init
(
project
=
PROJECT_ID
,
location
=
"us-central1"
)
response
=
rag
.
import_files
(
corpus_name
=
corpus_name
,
paths
=
paths
,
transformation_config
=
rag
.
TransformationConfig
(
rag
.
ChunkingConfig
(
chunk_size
=
1024
,
chunk_overlap
=
256
)
),
import_result_sink
=
"gs://sample-existing-folder/sample_import_result_unique.ndjson"
,
# Optional: This must be an existing Cloud Storage bucket folder, and the filename must be unique (non-existent).
llm_parser
=
rag
.
LlmParserConfig
(
model_name
=
"gemini-2.5-pro-preview-05-06"
,
max_parsing_requests_per_min
=
100
,
),
# Optional
max_embedding_requests_per_min
=
900
,
# Optional
)
print
(
f
"Imported
{
response
.
imported_rag_files_count
}
files."
)
REST
Before using any of the request data, make the following replacements:
PROJECT_ID
: Your project ID.
LOCATION
: The region to process the request.
RAG_CORPUS_ID
: The corpus ID of your RAG corpus.
FOLDER_RESOURCE_ID
: The resource ID of your
Drive folder.
GCS_URIS
: A list of Cloud Storage locations.
Example: gs://my-bucket1
.
CHUNK_SIZE
: Number of tokens each chunk should have.
CHUNK_OVERLAP
: Number of tokens overlap between chunks.
EMBEDDING_MODEL_QPM_RATE
: The QPM rate to limit RAG's
access to your embedding model. Example: 1,000.
HTTP method and URL:
POST
https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
/ragFiles:import
Request JSON body:
{
"import_rag_files_config"
:
{
"gcs_source"
:
{
"uris"
:
" GCS_URIS
"
},
"rag_file_chunking_config"
:
{
"chunk_size"
:
" CHUNK_SIZE
"
,
"chunk_overlap"
:
" CHUNK_OVERLAP
"
}
}
}
To send your request, choose one of these options:
curl
Note: The following command assumes that you have signed in to the
Google Cloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
, or
by using Cloud Shell, which automatically signs you into the
gcloud CLI CLI . You can check the active account by running
gcloud CLI auth list
.
Save the request body in a file named request.json, and run the
following command:
curl
-X
POST
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
-H
"Content-Type: application/json; charset=utf-8"
\
-d
@request.json
\
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
/ragFiles:import"
Powershell
Note: The following command assumes that you have signed in to the
gcloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
. You
can check the active account by running gcloud CLI auth
list
.
Save the request body in a file named request.json, and run the following
command:
$cred
=
gcloud
auth
print-access-token $headers
=
@ {
"Authorization"
=
"Bearer
$cred
"
}
Invoke-WebRequest
`
-Method
POST
`
-Headers
$headers
`
-ContentType:
"application/json; charset=utf-8"
`
-InFile
request.json
`
-Uri
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
/ragFiles:import"
|
Select-Object
-Expand
Content
A successful response returns the ImportRagFilesOperationMetadata
resource.
The following sample demonstrates how to import a file from
Cloud Storage. Use the max_embedding_requests_per_min
control field
to limit the rate at which RAG Engine calls the embedding model during the ImportRagFiles
indexing process. The field has a default value of 1000
calls
per minute.
PROJECT_ID
: Your project ID.
LOCATION
: The region to process the request.
RAG_CORPUS_ID
: The corpus ID of your RAG corpus.
GCS_URIS
: A list of Cloud Storage locations.
Example: gs://my-bucket1
.
CHUNK_SIZE
: Number of tokens each chunk should have.
CHUNK_OVERLAP
: Number of tokens overlap between chunks.
EMBEDDING_MODEL_QPM_RATE
: The QPM rate to limit RAGs
access to your embedding model. Example: 1,000.
//
ImportRagFiles
//
Import
a
single
Cloud
Storage
file
or
all
files
in
a
Cloud
Storage
bucket.
//
Input:
LOCATION,
PROJECT_ID,
RAG_CORPUS_ID,
GCS_URIS
//
Output:
ImportRagFilesOperationMetadataNumber
//
Use
ListRagFiles
to
find
the
server-generated
rag_file_id.
curl
-X
POST
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
-H
"Content-Type: application/json"
\
https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
/ragFiles:import
\
-d
'{
"import_rag_files_config": {
"gcs_source": {
"uris": " GCS_URIS
"
},
"rag_file_chunking_config": {
"chunk_size": CHUNK_SIZE
,
"chunk_overlap": CHUNK_OVERLAP
},
"max_embedding_requests_per_min": EMBEDDING_MODEL_QPM_RATE
}
}'
The following sample demonstrates how to import a file from
Drive. Use the max_embedding_requests_per_min
control field to
limit the rate at which RAG Engine calls the embedding model during the ImportRagFiles
indexing process. The field has a default value of 1000
calls
per minute.
PROJECT_ID
: Your project ID.
LOCATION
: The region to process the request.
RAG_CORPUS_ID
: The corpus ID of your RAG corpus.
FOLDER_RESOURCE_ID
: The resource ID of your
Drive folder.
CHUNK_SIZE
: Number of tokens each chunk should have.
CHUNK_OVERLAP
: Number of tokens overlap between chunks.
EMBEDDING_MODEL_QPM_RATE
: The QPM rate to limit RAG's
access to your embedding model. Example: 1,000.
//
ImportRagFiles
//
Import
all
files
in
a
Google
Drive
folder.
//
Input:
LOCATION,
PROJECT_ID,
RAG_CORPUS_ID,
FOLDER_RESOURCE_ID
//
Output:
ImportRagFilesOperationMetadataNumber
//
Use
ListRagFiles
to
find
the
server-generated
rag_file_id.
curl
-X
POST
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
-H
"Content-Type: application/json"
\
https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
/ragFiles:import
\
-d
'{
"import_rag_files_config": {
"google_drive_source": {
"resource_ids": {
"resource_id": " FOLDER_RESOURCE_ID
",
"resource_type": "RESOURCE_TYPE_FOLDER"
}
},
"max_embedding_requests_per_min": EMBEDDING_MODEL_QPM_RATE
}
}'
List RAG files example
These code samples demonstrate how to list RAG files.
REST
Before using any of the request data, make the following replacements:
PROJECT_ID
: Your project ID.
LOCATION
: The region to process the request.
RAG_CORPUS_ID
: The ID of the RagCorpus
resource.
PAGE_SIZE
: The standard list page size. You might adjust
the number of RagFiles
to return per page by updating the page_size
parameter.
PAGE_TOKEN
: The standard list page token. Obtained using ListRagFilesResponse.next_page_token
of the previous VertexRagDataService.ListRagFiles
call.
HTTP method and URL:
GET
https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
/ragFiles?page_size =
PAGE_SIZE
&page_token =
PAGE_TOKEN
To send your request, choose one of these options:
curl
Note: The following command assumes that you have signed in to the
Google Cloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
, or
by using Cloud Shell, which automatically signs you into the
gcloud CLI CLI . You can check the active account by running
gcloud CLI auth list
.
Run the following command:
curl
-X
GET
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
/ragFiles?page_size= PAGE_SIZE
&page_token= PAGE_TOKEN
"
Powershell
Note: The following command assumes that you have signed in to the
gcloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
. You
can check the active account by running gcloud CLI auth
list
.
Run the following command:
$cred
=
gcloud
auth
print-access-token $headers
=
@ {
"Authorization"
=
"Bearer
$cred
"
}
Invoke-WebRequest
`
-Method
GET
`
-Headers
$headers
`
-Uri
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
/ragFiles?page_size= PAGE_SIZE
&page_token= PAGE_TOKEN
"
|
Select-Object
-Expand
Content
You should receive a successful status code (2xx) along with a list of RagFiles
under the given RAG_CORPUS_ID
.
Get a RAG file example
These code samples demonstrate how to get a RAG file.
REST
Before using any of the request data, make the following replacements:
PROJECT_ID
: Your project ID.
LOCATION
: The region to process the request.
RAG_CORPUS_ID
: The ID of the RagCorpus
resource.
RAG_FILE_ID
: The ID of the RagFile
resource.
HTTP method and URL:
GET
https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
/ragFiles/ RAG_FILE_ID
To send your request, choose one of these options:
curl
Note: The following command assumes that you have signed in to the
Google Cloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
, or
by using Cloud Shell, which automatically signs you into the
gcloud CLI CLI . You can check the active account by running
gcloud CLI auth list
.
Run the following command:
curl
-X
GET
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
/ragFiles/ RAG_FILE_ID
"
Powershell
Note: The following command assumes that you have signed in to the
gcloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
. You
can check the active account by running gcloud CLI auth
list
.
Run the following command:
$cred
=
gcloud
auth
print-access-token $headers
=
@ {
"Authorization"
=
"Bearer
$cred
"
}
Invoke-WebRequest
`
-Method
GET
`
-Headers
$headers
`
-Uri
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
/ragFiles/ RAG_FILE_ID
"
|
Select-Object
-Expand
Content
A successful response returns the RagFile
resource.
Delete a RAG file example
These code samples demonstrate how to delete a RAG file.
REST
Before using any of the request data, make the following replacements:
PROJECT_ID
>: Your project ID.
LOCATION
: The region to process the request.
RAG_CORPUS_ID
: The ID of the RagCorpus resource.
RAG_FILE_ID
: The ID of the RagFile resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}
.
HTTP method and URL:
DELETE
https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
/ragFiles/ RAG_FILE_ID
To send your request, choose one of these options:
curl
Note: The following command assumes that you have signed in to the
Google Cloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
, or
by using Cloud Shell, which automatically signs you into the
gcloud CLI CLI . You can check the active account by running
gcloud CLI auth list
.
Run the following command:
curl
-X
DELETE
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
/ragFiles/ RAG_FILE_ID
"
Powershell
Note: The following command assumes that you have signed in to the
gcloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
. You
can check the active account by running gcloud CLI auth
list
.
Run the following command:
$cred
=
gcloud
auth
print-access-token $headers
=
@ {
"Authorization"
=
"Bearer
$cred
"
}
Invoke-WebRequest
`
-Method
DELETE
`
-Headers
$headers
`
-Uri
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/ragCorpora/ RAG_CORPUS_ID
/ragFiles/ RAG_FILE_ID
"
|
Select-Object
-Expand
Content
Retrieval query example
When a user asks a question or provides a prompt, the retrieval component in RAG
searches through its knowledge base to find information that is relevant to the
query.
REST
Before using any of the request data, make the following replacements:
LOCATION
: The region to process the request.
PROJECT_ID
: Your project ID.
TEXT
: The query text to get relevant contexts.
SIMILARITY_TOP_K
: The number of top contexts to
retrieve.
VECTOR_DISTANCE_THRESHOLD
: Only contexts with a vector
distance smaller than the threshold are returned.
RAG_CORPUS_RESOURCE
: The name of the RagCorpus
resource.
Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}
.
HTTP method and URL:
POST
https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
:retrieveContexts
Request JSON body:
{
"query"
:
{
"text"
:
TEXT
,
"ragRetrievalConfig"
:
{
"topK"
:
SIMILARITY_TOP_K
,
"filter"
:
{
"vectorDistanceThreshold"
:
VECTOR_DISTANCE_THRESHOLD
}
},
"vertex_rag_store"
:
{
"rag_resources"
:
{
"rag_corpus"
:
" RAG_CORPUS_RESOURCE
"
}
}
}
}
curl
Note: The following command assumes that you have signed in to the
Google Cloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
, or
by using Cloud Shell, which automatically signs you into the
gcloud CLI CLI . You can check the active account by running
gcloud CLI auth list
.
Save the request body in a file named request.json, and run the following
command:
curl
-X
POST
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
-H
"Content-Type: application/json; charset=utf-8"
\
-d
@request.json
\
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
:retrieveContexts"
Powershell
Note: The following command assumes that you have signed in to the
gcloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
. You
can check the active account by running gcloud CLI auth
list
.
Save the request body in a file named request.json, and run the following
command:
$cred
=
gcloud
auth
print-access-token $headers
=
@ {
"Authorization"
=
"Bearer
$cred
"
}
Invoke-WebRequest
`
-Method
POST
`
-Headers
$headers
`
-ContentType:
"application/json; charset=utf-8"
`
-InFile
request.json
`
-Uri
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
:retrieveContexts"
|
Select-Object
-Expand
Content
You should receive a successful status code (2xx) and a list of related RagFiles
.
Generation example
The LLM generates a grounded response using the retrieved contexts.
REST
Before using any of the request data, make the following replacements:
PROJECT_ID
: Your project ID.
LOCATION
: The region to process the request.
MODEL_ID
: LLM model for content generation. Example: gemini-2.5-flash
.
GENERATION_METHOD
: LLM method for content generation.
Options: generateContent
, streamGenerateContent
.
INPUT_PROMPT
: The text sent to the LLM for content
generation. Try to use a prompt relevant to the uploaded rag Files.
RAG_CORPUS_RESOURCE
: The name of the RagCorpus
resource.
Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}
.
SIMILARITY_TOP_K
: Optional: The number of top contexts to
retrieve.
VECTOR_DISTANCE_THRESHOLD
: Optional: Contexts with a
vector distance smaller than the threshold are returned.
USER
: Your username.
HTTP method and URL:
POST
https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/publishers/google/models/ MODEL_ID
: GENERATION_METHOD
Request JSON body:
{
"contents"
:
{
"role"
:
" USER
"
,
"parts"
:
{
"text"
:
" INPUT_PROMPT
"
}
},
"tools"
:
{
"retrieval"
:
{
"disable_attribution"
:
false
,
"vertex_rag_store"
:
{
"rag_resources"
:
{
"rag_corpus"
:
" RAG_CORPUS_RESOURCE
"
},
"similarity_top_k"
:
" SIMILARITY_TOP_K
"
,
"vector_distance_threshold"
:
VECTOR_DISTANCE_THRESHOLD
}
}
}
}
To send your request, choose one of these options:
curl
Note: The following command assumes that you have signed in to the
Google Cloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
, or
by using Cloud Shell, which automatically signs you into the
gcloud CLI CLI . You can check the active account by running
gcloud CLI auth list
.
Save the request body in a file named request.json, and execute the following
command:
curl
-X
POST
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
-H
"Content-Type: application/json; charset=utf-8"
\
-d
@request.json
\
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/publishers/google/models/ MODEL_ID
: GENERATION_METHOD
"
Powershell
Note: The following command assumes that you have signed in to the
gcloud CLI CLI with your user account by running
gcloud CLI init
or gcloud CLI auth login
. You
can check the active account by running gcloud CLI auth
list
.
Save the request body in a file named request.json, and execute the following
command:
$cred
=
gcloud
auth
print-access-token $headers
=
@ {
"Authorization"
=
"Bearer
$cred
"
}
Invoke-WebRequest
`
-Method
POST
`
-Headers
$headers
`
-ContentType:
"application/json; charset=utf-8"
`
-InFile
request.json
`
-Uri
"https:// LOCATION
-aiplatform.googleapis.com/v1/projects/ PROJECT_ID
/locations/ LOCATION
/publishers/google/models/ MODEL_ID
: GENERATION_METHOD
"
|
Select-Object
-Expand
Content
A successful response returns the generated content with citations.
Project management examples
Tier is a project-level setting available under the RagEngineConfig
resource and impacts RAG corpora using RagManagedDb
. To get the tier
configuration, use GetRagEngineConfig
. To update the tier configuration,
use UpdateRagEngineConfig
.
For more information on managing your tier configuration, see Manage tiers
.
Get project configuration
The following code samples demonstrate how to read your RagEngineConfig
:
Console
In the Google Cloud console, go to the RAG Engine
page. Go to RAG Engine
Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
Click Configure RAG Engine
. The Configure RAG Engine
pane appears. You can see
the tier that's selected for your RAG Engine.
Click Cancel
.
Python
from
vertexai
import
rag
import
vertexai
PROJECT_ID
=
YOUR_PROJECT_ID
LOCATION
=
YOUR_RAG_ENGINE_LOCATION
# Initialize Vertex AI API once per session
vertexai
.
init
(
project
=
PROJECT_ID
,
location
=
LOCATION
)
rag_engine_config
=
rag
.
rag_data
.
get_rag_engine_config
(
name
=
f
"projects/
{
PROJECT_ID
}
/locations/
{
LOCATION
}
/ragEngineConfig"
)
print
(
rag_engine_config
)
REST
curl
-X
GET
\
-H
"Content-Type: application/json"
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
https:// ${
LOCATION
}
-aiplatform.googleapis.com/v1/projects/ ${
PROJECT_ID
}
/locations/ ${
LOCATION
}
/ragEngineConfig
Update project configuration
This section provides code samples to demonstrate how to change your
configuration to a Scaled, Basic, or Unprovisioned tier.
Update your RagEngineConfig
to the Scaled tier
The following code samples demonstrate how to set the RagEngineConfig
to the
Scaled tier:
Console
In the Google Cloud console, go to the RAG Engine
page. Go to RAG Engine
Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
Click Configure RAG Engine
. The Configure RAG Engine
pane appears.
Select the tier that you want to run your RAG Engine.
Click Save
.
Python
from
vertexai
import
rag
import
vertexai
PROJECT_ID
=
YOUR_PROJECT_ID
LOCATION
=
YOUR_RAG_ENGINE_LOCATION
# Initialize Vertex AI API once per session
vertexai
.
init
(
project
=
PROJECT_ID
,
location
=
LOCATION
)
rag_engine_config_name
=
f
"projects/
{
PROJECT_ID
}
/locations/
{
LOCATION
}
/ragEngineConfig"
new_rag_engine_config
=
rag
.
RagEngineConfig
(
name
=
rag_engine_config_name
,
rag_managed_db_config
=
rag
.
RagManagedDbConfig
(
tier
=
rag
.
Scaled
()),
)
updated_rag_engine_config
=
rag
.
rag_data
.
update_rag_engine_config
(
rag_engine_config
=
new_rag_engine_config
)
print
(
updated_rag_engine_config
)
REST
curl
-X
PATCH
\
-H
"Content-Type: application/json"
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
https:// ${
LOCATION
}
-aiplatform.googleapis.com/v1/projects/ ${
PROJECT_ID
}
/locations/ ${
LOCATION
}
/ragEngineConfig
-d
"{'ragManagedDbConfig': {'scaled': {}}}"
Update your RagEngineConfig
to the Basic tier
The following code samples demonstrate how to set the RagEngineConfig
to the
Basic tier:
Note: If you have a large amount of data in your RagManagedDb
across your RAG
corpora, downgrading to a Basic tier can fail due to insufficient compute and
storage capacity.
Console
In the Google Cloud console, go to the RAG Engine
page. Go to RAG Engine
Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
Click Configure RAG Engine
. The Configure RAG Engine
pane appears.
Select the tier that you want to run your RAG Engine.
Click Save
.
Python
from
vertexai
import
rag
import
vertexai
PROJECT_ID
=
YOUR_PROJECT_ID
LOCATION
=
YOUR_RAG_ENGINE_LOCATION
# Initialize Vertex AI API once per session
vertexai
.
init
(
project
=
PROJECT_ID
,
location
=
LOCATION
)
rag_engine_config_name
=
f
"projects/
{
PROJECT_ID
}
/locations/
{
LOCATION
}
/ragEngineConfig"
new_rag_engine_config
=
rag
.
RagEngineConfig
(
name
=
rag_engine_config_name
,
rag_managed_db_config
=
rag
.
RagManagedDbConfig
(
tier
=
rag
.
Basic
()),
)
updated_rag_engine_config
=
rag
.
rag_data
.
update_rag_engine_config
(
rag_engine_config
=
new_rag_engine_config
)
print
(
updated_rag_engine_config
)
REST
curl
-X
PATCH
\
-H
"Content-Type: application/json"
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
https:// ${
LOCATION
}
-aiplatform.googleapis.com/v1/projects/ ${
PROJECT_ID
}
/locations/ ${
LOCATION
}
/ragEngineConfig
-d
"{'ragManagedDbConfig': {'basic': {}}}"
Update your RagEngineConfig
to the Unprovisioned tier
The following code samples demonstrate how to set the RagEngineConfig
to the
Unprovisioned tier:
Console
In the Google Cloud console, go to the RAG Engine
page. Go to RAG Engine
Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
Click Configure RAG Engine
. The Configure RAG Engine
pane appears.
Click Delete RAG Engine
. A confirmation dialog appears.
Verify that you're about to delete your data in RAG Engine by typing delete
, then
click Confirm
.
Click Save
.
Python
from
vertexai
import
rag
import
vertexai
PROJECT_ID
=
YOUR_PROJECT_ID
LOCATION
=
YOUR_RAG_ENGINE_LOCATION
# Initialize Vertex AI API once per session
vertexai
.
init
(
project
=
PROJECT_ID
,
location
=
LOCATION
)
rag_engine_config_name
=
f
"projects/
{
PROJECT_ID
}
/locations/
{
LOCATION
}
/ragEngineConfig"
new_rag_engine_config
=
rag
.
RagEngineConfig
(
name
=
rag_engine_config_name
,
rag_managed_db_config
=
rag
.
RagManagedDbConfig
(
tier
=
rag
.
Unprovisioned
()),
)
updated_rag_engine_config
=
rag
.
rag_data
.
update_rag_engine_config
(
rag_engine_config
=
new_rag_engine_config
)
print
(
updated_rag_engine_config
)
REST
curl
-X
PATCH
\
-H
"Content-Type: application/json"
\
-H
"Authorization: Bearer
$(
gcloud
auth
print-access-token )
"
\
https:// ${
LOCATION
}
-aiplatform.googleapis.com/v1/projects/ ${
PROJECT_ID
}
/locations/ ${
LOCATION
}
/ragEngineConfig
-d
"{'ragManagedDbConfig': {'unprovisioned': {}}}"
What's next
Send feedback
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License
, and code samples are licensed under the Apache 2.0 License
. For details, see the Google Developers Site Policies
. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-05-29 UTC.
Need to tell us more?
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2026-05-29 UTC."],[],[]]