RAG Engine API

Vertex AI RAG Engine is a component of the Vertex AI platform that facilitates Retrieval-Augmented Generation (RAG). RAG Engine enables large language models (LLMs) to access and incorporate data from external knowledge sources, such as documents and databases. By using RAG, LLMs can generate more accurate and informative responses.

Parameters list

This section lists the following:

Parameters	Examples
See Corpus management parameters.	See Corpus management examples.
See File management parameters.	See File management examples.
See Project management parameters.	See Project management examples.

Corpus management parameters

For information about a RAG corpus, see Corpus management.

Create a RAG corpus

This table lists the parameters used to create a RAG corpus.

Body Request

Parameters

display_name

Required: string

The display name of the RAG corpus.

description

Optional: string

The description of the RAG corpus.

encryption_spec

Optional: Immutable: string

The name of the CMEK key used to encrypt at-rest data that's related to the RAG corpus. The key name applies only to the RagManaged option for the vector database. This field can be set only when the corpus is created; after that, it can't be updated or deleted.

Format: projects/{project}/locations/{location}/keyRings/{key_ring}/cryptoKeys/{key_name}
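As a rough sketch, a key name in the format above can be assembled and checked on the client before it is sent. The helper names `cmek_key_name` and `is_cmek_key_name`, the sample values, and the choice to accept any non-slash characters in each segment are assumptions made for illustration; they are not part of the API.

```python
import re

# Pattern derived from the documented format. Accepting any run of
# non-"/" characters per segment is an assumption, not an API rule.
_CMEK_PATTERN = re.compile(
    r"projects/[^/]+/locations/[^/]+/keyRings/[^/]+/cryptoKeys/[^/]+"
)


def cmek_key_name(project: str, location: str, key_ring: str, key_name: str) -> str:
    """Assemble a key name in the documented encryption_spec format."""
    return (
        f"projects/{project}/locations/{location}"
        f"/keyRings/{key_ring}/cryptoKeys/{key_name}"
    )


def is_cmek_key_name(value: str) -> bool:
    """Return True if the whole string has the documented shape."""
    return _CMEK_PATTERN.fullmatch(value) is not None


# Hypothetical sample values.
name = cmek_key_name("my-project", "us-central1", "my-ring", "my-key")
print(name)
print(is_cmek_key_name(name))
```

Because the field is immutable, a check like this before corpus creation can catch a malformed name while it can still be corrected.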

vector_db_config

Optional: Immutable: vectorDbConfig

The configuration for the Vector DBs.

vertex_ai_search_config.serving_config

Optional: string

The configuration for the Vertex AI Search.

Format: projects/{project}/locations/{location}/collections/{collection}/engines/{engine}/servingConfigs/{serving_config} or projects/{project}/locations/{location}/collections/{collection}/dataStores/{data_store}/servingConfigs/{serving_config}

`vectorDbConfig`

Parameters

rag_managed_db

oneof vector_db : vectorDbConfig.RagManagedDb

If no vector database is specified, rag_managed_db is the default vector database.

pinecone

oneof vector_db : vectorDbConfig.Pinecone

Specifies your Pinecone instance.

pinecone.index_name

string

This is the name used to create the Pinecone index that's used with the RAG corpus.

This value can't be changed after it's set. You can leave it empty in the CreateRagCorpus API call, and set it to a non-empty value in a follow-up UpdateRagCorpus API call.

vertex_vector_search

oneof vector_db : vectorDbConfig.VertexVectorSearch

Specifies your Vertex Vector Search instance.

vertex_vector_search.index

string

This is the resource name of the Vector Search index that's used with the RAG corpus.

Format: projects/{project}/locations/{location}/indexes/{index}

This value can't be changed after it's set. You can leave it empty in the CreateRagCorpus API call, and set it to a non-empty value in a follow-up UpdateRagCorpus API call.

vertex_vector_search.index_endpoint

string

This is the resource name of the Vector Search index endpoint that's used with the RAG corpus.

Format: projects/{project}/locations/{location}/indexEndpoints/{index_endpoint}

This value can't be changed after it's set. You can leave it empty in the CreateRagCorpus API call, and set it to a non-empty value in a follow-up UpdateRagCorpus API call.

api_auth.api_key_config.api_key_secret_version

string

This is the full resource name of the secret that is stored in Secret Manager, which contains your Pinecone API key.

Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}

You can leave it empty in the CreateRagCorpus API call, and set it to a non-empty value in a follow-up UpdateRagCorpus API call.

rag_embedding_model_config.vertex_prediction_endpoint.endpoint

Optional: Immutable: string

The embedding model to use for the RAG corpus. This value can't be changed after it's set. If you leave it empty, we use text-embedding-005 as the embedding model.
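The parameters above can be combined into a single request body. The following is a minimal sketch that assumes the dotted parameter names map directly onto nested JSON objects under `vector_db_config`; that nesting, the function name `build_create_corpus_body`, and all sample values are assumptions for illustration rather than a confirmed request schema.

```python
import json
from typing import Optional


def build_create_corpus_body(
    display_name: str,
    description: Optional[str] = None,
    pinecone_index_name: Optional[str] = None,
    api_key_secret_version: Optional[str] = None,
) -> dict:
    """Build a create-corpus body from the parameters listed above."""
    if not display_name:
        raise ValueError("display_name is required")
    body: dict = {"display_name": display_name}
    if description is not None:
        body["description"] = description
    if pinecone_index_name is not None:
        # Assumed nesting: pinecone and api_auth sit under vector_db_config.
        # An empty index name is allowed at creation and can be set once
        # later through UpdateRagCorpus.
        vector_db_config: dict = {"pinecone": {"index_name": pinecone_index_name}}
        if api_key_secret_version:
            vector_db_config["api_auth"] = {
                "api_key_config": {"api_key_secret_version": api_key_secret_version}
            }
        body["vector_db_config"] = vector_db_config
    return body


body = build_create_corpus_body(
    display_name="test_corpus",
    description="Corpus Description",
    pinecone_index_name="my-index",  # hypothetical index name
    api_key_secret_version="projects/123456/secrets/pinecone-key/versions/1",
)
print(json.dumps(body, indent=2))
```

Omitting `pinecone_index_name` leaves `vector_db_config` out entirely, which corresponds to the documented default of `rag_managed_db`.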

Update a RAG corpus

This table lists the parameters used to update a RAG corpus.

Body Request

Parameters

display_name

Optional: string

The display name of the RAG corpus.

description

Optional: string

The description of the RAG corpus.

rag_vector_db.pinecone.index_name

string

This is the name used to create the Pinecone index that's used with the RAG corpus.

If your RagCorpus was created with a Pinecone configuration, and this field has never been set before, then you can update the Pinecone instance's index name.

rag_vector_db.vertex_vector_search.index

string

This is the resource name of the Vector Search index that's used with the RAG corpus.

Format: projects/{project}/locations/{location}/indexes/{index}

If your RagCorpus was created with a Vector Search configuration, and this field has never been set before, then you can update it.

rag_vector_db.vertex_vector_search.index_endpoint

string

This is the resource name of the Vector Search index endpoint that's used with the RAG corpus.

Format: projects/{project}/locations/{location}/indexEndpoints/{index_endpoint}

If your RagCorpus was created with a Vector Search configuration, and this field has never been set before, then you can update it.

rag_vector_db.api_auth.api_key_config.api_key_secret_version

string

The full resource name of the secret that is stored in Secret Manager, which contains your Pinecone API key.

Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}

List RAG corpora

This table lists the parameters used to list RAG corpora.

Parameters

page_size

Optional: int

The standard list page size.

page_token

Optional: string

The standard list page token. Typically obtained from ListRagCorporaResponse.next_page_token of the previous VertexRagDataService.ListRagCorpora call.
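The paging parameters are typically used in a loop that passes each response's `next_page_token` back as the next `page_token`. The sketch below shows that loop against a hypothetical in-memory stand-in (`fake_list_page`) so that it runs without network access; the response key `rag_corpora` and both function names are assumptions, not confirmed API names.

```python
from typing import Callable, Iterator, List, Optional


def iter_corpora(
    list_page: Callable[[int, Optional[str]], dict], page_size: int = 2
) -> Iterator[dict]:
    """Yield corpora from every page by following next_page_token."""
    page_token: Optional[str] = None
    while True:
        response = list_page(page_size, page_token)
        yield from response.get("rag_corpora", [])
        page_token = response.get("next_page_token")
        if not page_token:
            break  # no token means this was the last page


# Hypothetical stand-in for a ListRagCorpora call: five corpora in memory,
# with the page token encoding the start offset.
_FAKE_CORPORA: List[dict] = [{"display_name": f"corpus-{i}"} for i in range(5)]


def fake_list_page(page_size: int, page_token: Optional[str]) -> dict:
    start = int(page_token) if page_token else 0
    end = start + page_size
    response: dict = {"rag_corpora": _FAKE_CORPORA[start:end]}
    if end < len(_FAKE_CORPORA):
        response["next_page_token"] = str(end)
    return response


names = [corpus["display_name"] for corpus in iter_corpora(fake_list_page)]
print(names)
```

In real use, `list_page` would wrap the actual list call; the loop itself stays the same.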

Get a RAG corpus

This table lists parameters used to get a RAG corpus.

Parameters

name

string

The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

Delete a RAG corpus

This table lists parameters used to delete a RAG corpus.

Parameters

name

string

The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

File management parameters

For information about a RAG file, see File management.

Upload a RAG file

This table lists parameters used to upload a RAG file.

Body Request

Parameters

parent

string

The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

rag_file

Required: RagFile

The file to upload.

upload_rag_file_config

Required: UploadRagFileConfig

The configuration for the RagFile to be uploaded into the RagCorpus.

RagFile

display_name

Required: string

The display name of the RAG file.

description

Optional: string

The description of the RAG file.

UploadRagFileConfig

rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_size

int32

Number of tokens each chunk has.

rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_overlap

int32

The overlap between chunks.
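To illustrate how `chunk_size` and `chunk_overlap` interact, here is a minimal sketch of fixed-length chunking with overlap. It splits on whitespace as a stand-in for a real tokenizer and is only one common interpretation of these two parameters; the service's actual tokenization and boundary handling may differ, and the function name `fixed_length_chunks` is hypothetical.

```python
from typing import List


def fixed_length_chunks(
    tokens: List[str], chunk_size: int, chunk_overlap: int
) -> List[List[str]]:
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end
    return chunks


# Whitespace "tokens" stand in for real model tokens (an assumption).
tokens = "a b c d e f g h i j".split()
chunks = fixed_length_chunks(tokens, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

With a chunk size of 4 and an overlap of 1, each window advances by 3 tokens, so the last token of one chunk reappears as the first token of the next.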

Import RAG files

This table lists parameters used to import a RAG file.

Parameters

parent

Required: string

The name of the RagCorpus resource.

Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

gcs_source

oneof import_source : GcsSource

Cloud Storage location.

Supports importing individual files as well as entire Cloud Storage directories.

gcs_source.uris

list of string

Cloud Storage URI that contains the upload file.

google_drive_source

oneof import_source : GoogleDriveSource

Google Drive location.

Supports importing individual files as well as Google Drive folders.

slack_source

oneof import_source : SlackSource

The Slack channel where the file is uploaded.

jira_source

oneof import_source : JiraSource

The Jira query where the file is uploaded.

share_point_sources

oneof import_source : SharePointSources

The SharePoint sources where the file is uploaded.

rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_size

int32

Number of tokens each chunk has.

rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_overlap

int32

The overlap between chunks.

rag_file_parsing_config

Optional: RagFileParsingConfig

Specifies the parsing configuration for RagFiles.

If this field isn't set, RAG uses the default parser.

max_embedding_requests_per_min

Optional: int32

The maximum number of queries per minute that this job is allowed to make to the embedding model specified on the corpus. This value is specific to this job and not shared across other import jobs. Consult the Quotas page on the project to set an appropriate value.

If unspecified, a default value of 1,000 QPM is used.
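When choosing a value for `max_embedding_requests_per_min`, a back-of-the-envelope estimate of the shortest possible embedding time can help. The sketch below assumes one embedding request per chunk, which this section does not state, and ignores every other cost of an import job; the function name `min_minutes_for_embedding` is hypothetical.

```python
import math
from typing import Optional

# Documented default when max_embedding_requests_per_min is unspecified.
DEFAULT_MAX_EMBEDDING_REQUESTS_PER_MIN = 1000


def min_minutes_for_embedding(
    num_requests: int, max_requests_per_min: Optional[int] = None
) -> int:
    """Lower bound, in whole minutes, to issue num_requests at the given rate."""
    rate = (
        DEFAULT_MAX_EMBEDDING_REQUESTS_PER_MIN
        if max_requests_per_min is None
        else max_requests_per_min
    )
    if rate <= 0:
        raise ValueError("the per-minute limit must be positive")
    return math.ceil(num_requests / rate)


# Assumption: 25,000 chunks, one embedding request each.
print(min_minutes_for_embedding(25_000))       # at the default rate
print(min_minutes_for_embedding(25_000, 400))  # at a lower per-job limit
```

Because the limit is specific to each job, several concurrent import jobs can together exceed a project-level quota even when each job's own limit looks safe.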

GoogleDriveSource

resource_ids.resource_id

Required: string

The ID of the Google Drive resource.

resource_ids.resource_type

Required: string

The type of the Google Drive resource.

SlackSource

channels.channels

Repeated: SlackSource.SlackChannels.SlackChannel

Slack channel information, including the ID and time range to import.

channels.channels.channel_id

Required: string

The Slack channel ID.

channels.channels.start_time

Optional: google.protobuf.Timestamp

The starting timestamp for messages to import.

channels.channels.end_time

Optional: google.protobuf.Timestamp

The ending timestamp for messages to import.

channels.api_key_config.api_key_secret_version

Required: string

The full resource name of the secret that is stored in Secret Manager, which contains a Slack channel access token that has access to the Slack channel IDs. See: https://api.slack.com/tutorials/tracks/getting-a-token.

Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}
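The Slack parameters above can be assembled as follows. This sketch assumes the dotted names map onto nested JSON objects and that the outer `channels` field is a list; it also relies on the standard JSON encoding of `google.protobuf.Timestamp` as an RFC 3339 string. The function names and sample values are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def to_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 UTC string."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_slack_source(
    channel_id: str,
    api_key_secret_version: str,
    start_time: Optional[datetime] = None,
    end_time: Optional[datetime] = None,
) -> dict:
    """Build a slack_source value for one channel (nesting is assumed)."""
    channel: dict = {"channel_id": channel_id}
    if start_time is not None:
        channel["start_time"] = to_timestamp(start_time)
    if end_time is not None:
        channel["end_time"] = to_timestamp(end_time)
    return {
        "channels": [
            {
                "channels": [channel],
                "api_key_config": {
                    "api_key_secret_version": api_key_secret_version
                },
            }
        ]
    }


slack_source = build_slack_source(
    channel_id="C0123456789",  # hypothetical channel ID
    api_key_secret_version="projects/123456/secrets/slack-token/versions/1",
    start_time=datetime(2024, 1, 1, tzinfo=timezone.utc),
    end_time=datetime(2024, 2, 1, tzinfo=timezone.utc),
)
print(slack_source)
```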

JiraSource

jira_queries.projects

Repeated: string

A list of Jira projects to import in their entirety.

jira_queries.custom_queries

Repeated: string

A list of custom Jira queries to import. For information about JQL (Jira Query Language), see Jira Support.

jira_queries.email

Required: string

The Jira email address.

jira_queries.server_uri

Required: string

The Jira server URI.

jira_queries.api_key_config.api_key_secret_version

Required: string

The full resource name of the secret that is stored in Secret Manager, which contains the Jira API key. See: https://support.atlassian.com/atlassian-account/docs/manage-api-tokens-for-your-atlassian-account/

Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}

SharePointSources

share_point_sources.sharepoint_folder_path

oneof in folder_source : string

The path of the SharePoint folder to download from.

share_point_sources.sharepoint_folder_id

oneof in folder_source : string

The ID of the SharePoint folder to download from.

share_point_sources.drive_name

oneof in drive_source : string

The name of the drive to download from.

share_point_sources.drive_id

oneof in drive_source : string

The ID of the drive to download from.

share_point_sources.client_id

string

The Application ID for the app registered in Microsoft Azure Portal. The application must also be configured with the MS Graph permissions "Files.Read.All", "Sites.Read.All", and "BrowserSiteLists.Read.All".

share_point_sources.client_secret.api_key_secret_version

Required: string

The full resource name of the secret that is stored in Secret Manager, which contains the application secret for the app registered in Azure.

Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}

share_point_sources.tenant_id

string

Unique identifier of the Azure Active Directory Instance.

share_point_sources.sharepoint_site_name

string

The name of the SharePoint site to download from. This can be the site name or the site id.
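Because the folder and drive fields are each a oneof, at most one member of each pair should be set. The following sketch builds a single SharePoint source and rejects conflicting combinations on the client. The function name `build_share_point_source`, the flat field layout (the parameter names above with the `share_point_sources.` prefix removed), and the sample values are assumptions for illustration.

```python
from typing import Optional


def build_share_point_source(
    *,
    client_id: str,
    client_secret_version: str,
    tenant_id: str,
    sharepoint_site_name: str,
    sharepoint_folder_path: Optional[str] = None,
    sharepoint_folder_id: Optional[str] = None,
    drive_name: Optional[str] = None,
    drive_id: Optional[str] = None,
) -> dict:
    """Build one SharePoint source, enforcing the two oneof groups."""
    if sharepoint_folder_path and sharepoint_folder_id:
        raise ValueError("set only one of sharepoint_folder_path / sharepoint_folder_id")
    if drive_name and drive_id:
        raise ValueError("set only one of drive_name / drive_id")
    source: dict = {
        "client_id": client_id,
        "client_secret": {"api_key_secret_version": client_secret_version},
        "tenant_id": tenant_id,
        "sharepoint_site_name": sharepoint_site_name,
    }
    optional_fields = {
        "sharepoint_folder_path": sharepoint_folder_path,
        "sharepoint_folder_id": sharepoint_folder_id,
        "drive_name": drive_name,
        "drive_id": drive_id,
    }
    # Only include the oneof members that were actually provided.
    source.update({key: value for key, value in optional_fields.items() if value})
    return source


# All values below are hypothetical.
source = build_share_point_source(
    client_id="00000000-0000-0000-0000-000000000000",
    client_secret_version="projects/123456/secrets/sharepoint-secret/versions/1",
    tenant_id="11111111-1111-1111-1111-111111111111",
    sharepoint_site_name="my-site",
    sharepoint_folder_path="Shared Documents/reports",
    drive_name="Documents",
)
print(source)
```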

RagFileParsingConfig

layout_parser

oneof parser : RagFileParsingConfig.LayoutParser

The Layout Parser to use for RagFiles.

layout_parser.processor_name

string

The full resource name of a Document AI processor or processor version.

Format:
projects/{project_id}/locations/{location}/processors/{processor_id}
projects/{project_id}/locations/{location}/processors/{processor_id}/processorVersions/{processor_version_id}

layout_parser.max_parsing_requests_per_min

int32

The maximum number of requests the job is allowed to make to the Document AI processor per minute.

Consult https://cloud.google.com/document-ai/quotas and the Quota page for your project to set an appropriate value here. If unspecified, a default value of 120 QPM is used.

llm_parser

oneof parser : RagFileParsingConfig.LlmParser

The LLM parser to use for RagFiles.

llm_parser.model_name

string

The resource name of an LLM model.

Format: {publisher}/models/{model}

llm_parser.max_parsing_requests_per_min

int32

The maximum number of requests the job is allowed to make to the LLM model per minute.

To set an appropriate value, see the model quota section and the Quota page for your project. If unspecified, a default value of 5000 QPM is used.

Get a RAG file

This table lists parameters used to get a RAG file.

Parameters

name

string

The name of the RagFile resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}

Delete a RAG file

This table lists parameters used to delete a RAG file.

Parameters

name

string

The name of the RagFile resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}

Retrieval and prediction parameters

This section lists the retrieval and prediction parameters.

Retrieval parameters

This table lists parameters for retrieveContexts API.

Parameters

parent

Required: string

The resource name of the Location from which to retrieve RagContexts.
The caller must have permission to make calls in the project.

Format: projects/{project}/locations/{location}

vertex_rag_store

VertexRagStore

The data source for Vertex RagStore.

query

Required: RagQuery

Single RAG retrieve query.

`VertexRagStore`

VertexRagStore

rag_resources

list: RagResource

The representation of the RAG source. It can specify either the corpus only or specific RagFiles. Only one corpus, or multiple files from one corpus, is supported.

rag_resources.rag_corpus

Optional: string

RagCorpus resource name.

Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}

rag_resources.rag_file_ids

list: string

A list of RagFile resources.

Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file}
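The constraint that all files must come from one corpus can be checked on the client before a request is sent. This sketch parses RagFile names in the format above, derives the parent corpus, and rejects a mix of corpora. The function name `build_rag_resource` is hypothetical, and passing the full resource names through as `rag_file_ids` simply follows the format listed above.

```python
import re
from typing import Dict, List

# Captures the parent corpus name from a RagFile resource name.
_RAG_FILE = re.compile(
    r"(?P<corpus>projects/[^/]+/locations/[^/]+/ragCorpora/[^/]+)/ragFiles/[^/]+"
)


def build_rag_resource(rag_file_names: List[str]) -> Dict[str, object]:
    """Build one rag_resources entry from files that share a single corpus."""
    corpora = set()
    for name in rag_file_names:
        match = _RAG_FILE.fullmatch(name)
        if match is None:
            raise ValueError(f"not a RagFile resource name: {name!r}")
        corpora.add(match.group("corpus"))
    if len(corpora) != 1:
        raise ValueError("files must all belong to exactly one corpus")
    return {"rag_corpus": corpora.pop(), "rag_file_ids": list(rag_file_names)}


# Hypothetical resource names.
corpus = "projects/my-project/locations/us-central1/ragCorpora/111"
resource = build_rag_resource([f"{corpus}/ragFiles/1", f"{corpus}/ragFiles/2"])
print(resource)
```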

RagQuery

text

string

The query in text format to get relevant contexts.

rag_retrieval_config

Optional: RagRetrievalConfig

The retrieval configuration for the query.

RagRetrievalConfig

top_k

Optional: int32

The number of contexts to retrieve.

filter.vector_distance_threshold

oneof vector_db_threshold : double

Only returns contexts with a vector distance smaller than the threshold.

filter.vector_similarity_threshold

oneof vector_db_threshold : double

Only returns contexts with vector similarity larger than the threshold.
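The two thresholds point in opposite directions: a distance threshold keeps scores below it, while a similarity threshold keeps scores above it. The sketch below mimics that selection on the client purely to illustrate the semantics; the service applies these filters on the server, and the function name `select_contexts` and the sample scores are hypothetical.

```python
from typing import List, Optional, Tuple

Scored = Tuple[str, float]  # (context text, distance or similarity score)


def select_contexts(
    scored: List[Scored],
    top_k: int,
    vector_distance_threshold: Optional[float] = None,
    vector_similarity_threshold: Optional[float] = None,
) -> List[Scored]:
    """Filter by one threshold, order best-first, and keep top_k results.

    With no threshold set, scores are treated as distances (an assumption).
    """
    if vector_distance_threshold is not None and vector_similarity_threshold is not None:
        raise ValueError("set only one threshold (oneof vector_db_threshold)")
    if vector_similarity_threshold is not None:
        # Similarity: larger is better, keep scores above the threshold.
        kept = [item for item in scored if item[1] > vector_similarity_threshold]
        kept.sort(key=lambda item: item[1], reverse=True)
    else:
        # Distance: smaller is better, keep scores below the threshold.
        kept = [
            item for item in scored
            if vector_distance_threshold is None or item[1] < vector_distance_threshold
        ]
        kept.sort(key=lambda item: item[1])
    return kept[:top_k]


scores = [("a", 0.2), ("b", 0.7), ("c", 0.4), ("d", 0.1)]
by_distance = select_contexts(scores, top_k=2, vector_distance_threshold=0.5)
by_similarity = select_contexts(scores, top_k=5, vector_similarity_threshold=0.3)
print(by_distance)
print(by_similarity)
```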

ranking.rank_service.model_name

Optional: string

The model name of the rank service.

Example: semantic-ranker-512@latest

ranking.llm_ranker.model_name

Optional: string

The model name used for ranking.

Example: gemini-2.5-flash
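Putting the retrieval parameters together, a request body might be assembled as in the sketch below. It assumes the dotted parameter names translate directly into nested JSON objects; the function name `build_retrieve_contexts_body` and the sample values are hypothetical, and the exact wire format should be checked against the API reference.

```python
import json
from typing import Optional


def build_retrieve_contexts_body(
    rag_corpus: str,
    text: str,
    top_k: Optional[int] = None,
    vector_distance_threshold: Optional[float] = None,
    rank_service_model: Optional[str] = None,
) -> dict:
    """Build a retrieveContexts-style body from the parameters listed above."""
    retrieval_config: dict = {}
    if top_k is not None:
        retrieval_config["top_k"] = top_k
    if vector_distance_threshold is not None:
        retrieval_config["filter"] = {
            "vector_distance_threshold": vector_distance_threshold
        }
    if rank_service_model is not None:
        retrieval_config["ranking"] = {
            "rank_service": {"model_name": rank_service_model}
        }
    query: dict = {"text": text}
    if retrieval_config:
        query["rag_retrieval_config"] = retrieval_config
    return {
        "vertex_rag_store": {"rag_resources": [{"rag_corpus": rag_corpus}]},
        "query": query,
    }


retrieve_body = build_retrieve_contexts_body(
    rag_corpus="projects/my-project/locations/us-central1/ragCorpora/1234567890",
    text="What is RAG?",  # hypothetical query text
    top_k=5,
    vector_distance_threshold=0.5,
    rank_service_model="semantic-ranker-512@latest",
)
print(json.dumps(retrieve_body, indent=2))
```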

Prediction parameters

This table lists prediction parameters.

GenerateContentRequest

tools.retrieval.vertex_rag_store

VertexRagStore

Set to use a data source powered by Vertex AI RAG store.

See VertexRagStore for details.
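For generation, the same store is attached as a retrieval tool. The fragment below is a sketch that assumes `tools` is a list whose entries nest `retrieval.vertex_rag_store` as the parameter name suggests; the function name `build_rag_tool` and the corpus name are hypothetical, and the rest of the request (such as the prompt contents) is deliberately left out.

```python
import json


def build_rag_tool(rag_corpus: str) -> dict:
    """Build one tools entry that points generation at a RAG corpus."""
    return {
        "retrieval": {
            "vertex_rag_store": {
                "rag_resources": [{"rag_corpus": rag_corpus}]
            }
        }
    }


# Only the tools portion of a GenerateContentRequest is shown here.
request_fragment = {
    "tools": [
        build_rag_tool(
            "projects/my-project/locations/us-central1/ragCorpora/1234567890"
        )
    ]
}
print(json.dumps(request_fragment, indent=2))
```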

Project management parameters

This table lists project-level parameters.

`RagEngineConfig`

Parameters
`RagManagedDbConfig.scaled`	This tier offers production-scale performance along with auto scaling functionality.
`RagManagedDbConfig.basic`	This tier offers a cost-effective and low-compute tier.
`RagManagedDbConfig.unprovisioned`	This tier deletes the `RagManagedDb` and its underlying Spanner instance.

Corpus management examples

This section provides examples of how to use the API to manage your RAG corpus.

Create a RAG corpus example

These code samples demonstrate how to create a RAG corpus.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID : Your project ID.
LOCATION : The region to process the request.
CORPUS_DISPLAY_NAME : The display name of the RAG corpus.
CORPUS_DESCRIPTION : The description of the RAG corpus.

HTTP method and URL:

    POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora

Request JSON body:

    {
      "display_name": "CORPUS_DISPLAY_NAME",
      "description": "CORPUS_DESCRIPTION"
    }

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and run the following command:

    curl -X POST \
         -H "Authorization: Bearer $(gcloud auth print-access-token)" \
         -H "Content-Type: application/json; charset=utf-8" \
         -d @request.json \
         "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora"

PowerShell

Save the request body in a file named request.json, and run the following command:

   
    $cred = gcloud auth print-access-token
    $headers = @{ "Authorization" = "Bearer $cred" }

    Invoke-WebRequest `
        -Method POST `
        -Headers $headers `
        -ContentType: "application/json; charset=utf-8" `
        -InFile request.json `
        -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora" | Select-Object -Expand Content

You should receive a successful status code (2xx).

The following example demonstrates how to create a RAG corpus by using the REST API.

   
    # CreateRagCorpus
    # Input: LOCATION, PROJECT_ID, CORPUS_DISPLAY_NAME
    # Output: CreateRagCorpusOperationMetadata
    curl -X POST \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json" \
      https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora \
      -d '{
        "display_name" : "CORPUS_DISPLAY_NAME"
      }'

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

    from vertexai import rag
    import vertexai

    # TODO(developer): Update and un-comment below lines
    # PROJECT_ID = "your-project-id"
    # display_name = "test_corpus"
    # description = "Corpus Description"

    # Initialize Vertex AI API once per session
    vertexai.init(project=PROJECT_ID, location="us-central1")

    # Configure backend_config
    backend_config = rag.RagVectorDbConfig(
        rag_embedding_model_config=rag.RagEmbeddingModelConfig(
            vertex_prediction_endpoint=rag.VertexPredictionEndpoint(
                publisher_model="publishers/google/models/text-embedding-005"
            )
        )
    )

    corpus = rag.create_corpus(
        display_name=display_name,
        description=description,
        backend_config=backend_config,
    )
    print(corpus)
    # Example response:
    # RagCorpus(name='projects/1234567890/locations/us-central1/ragCorpora/1234567890',
    # display_name='test_corpus', description='Corpus Description', embedding_model_config=...
    # ...
