Deploy and manage index endpoints in a VPC network

Deploying an index to an endpoint includes the following three tasks:

  1. Create an IndexEndpoint if needed, or reuse an existing IndexEndpoint.
  2. Get the IndexEndpoint ID.
  3. Deploy the index to the IndexEndpoint.

Create an IndexEndpoint within your VPC network

If you are deploying an Index to an existing IndexEndpoint, you can skip this step.

Before you can use an index to serve online vector matching queries, you must deploy the Index to an IndexEndpoint within your VPC Network Peering network. The first step is to create an IndexEndpoint. You can deploy more than one index to an IndexEndpoint that shares the same VPC network.

gcloud

The following example uses the gcloud ai index-endpoints create command.

Before using any of the command data below, make the following replacements:

  • INDEX_ENDPOINT_NAME: Display name of the index endpoint.
  • VPC_NETWORK_NAME: The Google Compute Engine network name to which the index endpoint should be peered.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell

gcloud ai index-endpoints create \
  --display-name=INDEX_ENDPOINT_NAME \
  --network=VPC_NETWORK_NAME \
  --region=LOCATION \
  --project=PROJECT_ID

Windows (PowerShell)

gcloud ai index-endpoints create `
  --display-name=INDEX_ENDPOINT_NAME `
  --network=VPC_NETWORK_NAME `
  --region=LOCATION `
  --project=PROJECT_ID

Windows (cmd.exe)

gcloud ai index-endpoints create ^
  --display-name=INDEX_ENDPOINT_NAME ^
  --network=VPC_NETWORK_NAME ^
  --region=LOCATION ^
  --project=PROJECT_ID

The Google Cloud CLI tool might take a few minutes to create the IndexEndpoint.

REST

Before using any of the request data, make the following replacements:

  • INDEX_ENDPOINT_NAME: Display name of the index endpoint.
  • VPC_NETWORK_NAME: The Google Compute Engine network name to which the index endpoint should be peered.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.
  • PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints

Request JSON body:

{
  "display_name": "INDEX_ENDPOINT_NAME",
  "network": "VPC_NETWORK_NAME"
}

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateIndexEndpointOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-13T04:09:56.641107Z",
      "updateTime": "2022-01-13T04:09:56.641107Z"
    }
  }
}

You can poll for the status of the operation until the response includes "done": true.
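The polling step can be sketched as a small loop. This is a minimal illustration, not an official client: `wait_for_operation` and its parameters are hypothetical names, and `fetch_operation` stands in for whatever call you use to retrieve the operation resource as a dictionary (for example, an authenticated GET on the operation name). The interval and timeout defaults are arbitrary assumptions.

```python
import time
from typing import Any, Callable, Dict


def wait_for_operation(
    fetch_operation: Callable[[], Dict[str, Any]],
    poll_interval_seconds: float = 10.0,
    timeout_seconds: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a long-running operation until it reports "done": true.

    fetch_operation is a caller-supplied (hypothetical) function that
    returns the current operation resource as a dict.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        operation = fetch_operation()
        if operation.get("done"):
            # A finished operation carries either a result or an error.
            if "error" in operation:
                raise RuntimeError(f"Operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("Operation did not finish before the timeout")
        sleep(poll_interval_seconds)
```

Passing `sleep` as a parameter keeps the loop easy to exercise in tests without real waiting.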

Terraform

The following sample uses the google_vertex_ai_index_endpoint Terraform resource to create an index endpoint.

To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.

resource "google_vertex_ai_index_endpoint" "default" {
  display_name = "sample-endpoint"
  description  = "A sample index endpoint within a VPC network"
  region       = "us-central1"
  network      = "projects/${data.google_project.project.number}/global/networks/${google_compute_network.default.name}"
  depends_on = [
    google_service_networking_connection.default
  ]
}

resource "google_service_networking_connection" "default" {
  network                 = google_compute_network.default.id
  service                 = "servicenetworking.googleapis.com"
  reserved_peering_ranges = [google_compute_global_address.default.name]
  # Workaround to allow `terraform destroy`, see https://github.com/hashicorp/terraform-provider-google/issues/18729
  deletion_policy = "ABANDON"
}

resource "google_compute_global_address" "default" {
  name          = "sample-address"
  purpose       = "VPC_PEERING"
  address_type  = "INTERNAL"
  prefix_length = 16
  network       = google_compute_network.default.id
}

resource "google_compute_network" "default" {
  name = "sample-network"
}

data "google_project" "project" {}

# Cloud Storage bucket name must be unique
resource "random_id" "default" {
  byte_length = 8
}

# Create a Cloud Storage bucket
resource "google_storage_bucket" "bucket" {
  name                        = "vertex-ai-index-bucket-${random_id.default.hex}"
  location                    = "us-central1"
  uniform_bucket_level_access = true
}

# Create index content
resource "google_storage_bucket_object" "data" {
  name    = "contents/data.json"
  bucket  = google_storage_bucket.bucket.name
  content = <<EOF
{"id": "42", "embedding": [0.5, 1.0], "restricts": [{"namespace": "class", "allow": ["cat", "pet"]},{"namespace": "category", "allow": ["feline"]}]}
{"id": "43", "embedding": [0.6, 1.0], "restricts": [{"namespace": "class", "allow": ["dog", "pet"]},{"namespace": "category", "allow": ["canine"]}]}
EOF
}

resource "google_vertex_ai_index" "default" {
  region       = "us-central1"
  display_name = "sample-index-batch-update"
  description  = "A sample index for batch update"
  labels = {
    foo = "bar"
  }

  metadata {
    contents_delta_uri = "gs://${google_storage_bucket.bucket.name}/contents"
    config {
      dimensions                  = 2
      approximate_neighbors_count = 150
      distance_measure_type       = "DOT_PRODUCT_DISTANCE"
      algorithm_config {
        tree_ah_config {
          leaf_node_embedding_count    = 500
          leaf_nodes_to_search_percent = 7
        }
      }
    }
  }
  index_update_method = "BATCH_UPDATE"

  timeouts {
    create = "2h"
    update = "1h"
  }
}
 

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def vector_search_create_index_endpoint_vpc(
    project: str, location: str, display_name: str, network: str
) -> aiplatform.MatchingEngineIndexEndpoint:
    """Create a vector search index endpoint within a VPC network.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        display_name (str): Required. The index endpoint display name
        network (str): Required. The VPC network name, in the format of
            projects/{project number}/global/networks/{network name}.

    Returns:
        aiplatform.MatchingEngineIndexEndpoint - The created index endpoint.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create Index Endpoint
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint.create(
        display_name=display_name,
        network=network,
        description="Matching Engine VPC Index Endpoint",
    )

    return index_endpoint
 

Console

Use these instructions to create an index endpoint.

  1. In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section and select Vector Search.
  2. A list of your active indexes is displayed.
  3. At the top of the page, select the Index endpoints tab. Your index endpoints are displayed.
  4. Click Create new index endpoint. The Create a new index endpoint panel opens.
  5. Enter a display name for the index endpoint.
  6. In the Region field, select a region from the drop-down.
  7. In the Access field, select Private.
  8. Enter your peered VPC network details. Enter the full name of the Compute Engine network to which the index endpoint should be peered. The format should be projects/{project_num}/global/networks/{network_id}.
  9. Click Create.
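The full network name format used in step 8 can be assembled and checked with a short helper. This is only a sketch: `build_network_name` and `is_full_network_name` are hypothetical names, and the digits-only check assumes that `{project_num}` is the numeric project number rather than the project ID.

```python
import re

# Matches projects/{project_num}/global/networks/{network_id}
_NETWORK_PATTERN = re.compile(r"projects/(\d+)/global/networks/([^/]+)")


def build_network_name(project_number: str, network_id: str) -> str:
    """Return the full Compute Engine network name for a peered VPC."""
    if not str(project_number).isdigit():
        raise ValueError("project_number must be numeric (not the project ID)")
    if not network_id or "/" in network_id:
        raise ValueError("network_id must be a bare network name")
    return f"projects/{project_number}/global/networks/{network_id}"


def is_full_network_name(name: str) -> bool:
    """Check whether a string already has the full network name format."""
    return _NETWORK_PATTERN.fullmatch(name) is not None
```

For example, `build_network_name("123456789", "sample-network")` produces a value in the format expected by the network field.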

Deploy an index

gcloud

This example uses the gcloud ai index-endpoints deploy-index command.

Before using any of the command data below, make the following replacements:

  • INDEX_ENDPOINT_ID: The ID of the index endpoint.
  • DEPLOYED_INDEX_ID: A user-specified string that uniquely identifies the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
  • DEPLOYED_INDEX_ENDPOINT_NAME: Display name of the deployed index endpoint.
  • INDEX_ID: The ID of the index.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell

gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID \
  --deployed-index-id=DEPLOYED_INDEX_ID \
  --display-name=DEPLOYED_INDEX_ENDPOINT_NAME \
  --index=INDEX_ID \
  --region=LOCATION \
  --project=PROJECT_ID

Windows (PowerShell)

gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID `
  --deployed-index-id=DEPLOYED_INDEX_ID `
  --display-name=DEPLOYED_INDEX_ENDPOINT_NAME `
  --index=INDEX_ID `
  --region=LOCATION `
  --project=PROJECT_ID

Windows (cmd.exe)

gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID ^
  --deployed-index-id=DEPLOYED_INDEX_ID ^
  --display-name=DEPLOYED_INDEX_ENDPOINT_NAME ^
  --index=INDEX_ID ^
  --region=LOCATION ^
  --project=PROJECT_ID

The Google Cloud CLI tool might take a few minutes to deploy the index.

REST

Before using any of the request data, make the following replacements:

  • INDEX_ENDPOINT_ID: The ID of the index endpoint.
  • DEPLOYED_INDEX_ID: A user-specified string that uniquely identifies the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
  • DEPLOYED_INDEX_ENDPOINT_NAME: Display name of the deployed index endpoint.
  • INDEX_ID: The ID of the index.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.
  • PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:deployIndex

Request JSON body:

{
  "deployedIndex": {
    "id": "DEPLOYED_INDEX_ID",
    "index": "projects/PROJECT_ID/locations/LOCATION/indexes/INDEX_ID",
    "displayName": "DEPLOYED_INDEX_ENDPOINT_NAME"
  }
}

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeployIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-10-19T17:53:16.502088Z",
      "updateTime": "2022-10-19T17:53:16.502088Z"
    },
    "deployedIndexId": "DEPLOYED_INDEX_ID"
  }
}
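Before sending a deploy request, it can help to check the DEPLOYED_INDEX_ID locally against the rule stated in the replacements list: it starts with a letter and contains only letters, numbers, or underscores. The following is a minimal sketch with a hypothetical function name. It encodes only that stated rule, so any additional constraints described under DeployedIndex.id (such as a length limit) are not covered.

```python
import re

# Starts with a letter, followed by letters, digits, or underscores.
_DEPLOYED_INDEX_ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_deployed_index_id(deployed_index_id: str) -> bool:
    """Check a deployed index ID against the documented character rule."""
    return _DEPLOYED_INDEX_ID_PATTERN.fullmatch(deployed_index_id) is not None
```

With this check, `deployed_index_for_vpc` passes, while IDs such as `1st_index` or `my-index` are rejected.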

Terraform

The following sample uses the google_vertex_ai_index_endpoint_deployed_index Terraform resource to deploy an index to an index endpoint.

To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.

provider "google" {
  region = "us-central1"
}

resource "google_vertex_ai_index_endpoint_deployed_index" "default" {
  depends_on        = [google_vertex_ai_index_endpoint.default]
  index_endpoint    = google_vertex_ai_index_endpoint.default.id
  index             = google_vertex_ai_index.default.id
  deployed_index_id = "deployed_index_for_vpc"
}

resource "google_vertex_ai_index_endpoint" "default" {
  display_name = "sample-endpoint"
  description  = "A sample index endpoint within a VPC network"
  region       = "us-central1"
  network      = "projects/${data.google_project.project.number}/global/networks/${google_compute_network.default.name}"
  depends_on = [
    google_service_networking_connection.default
  ]
}

resource "google_service_networking_connection" "default" {
  network                 = google_compute_network.default.id
  service                 = "servicenetworking.googleapis.com"
  reserved_peering_ranges = [google_compute_global_address.default.name]
  # Workaround to allow `terraform destroy`, see https://github.com/hashicorp/terraform-provider-google/issues/18729
  deletion_policy = "ABANDON"
}

resource "google_compute_global_address" "default" {
  name          = "sample-address"
  purpose       = "VPC_PEERING"
  address_type  = "INTERNAL"
  prefix_length = 16
  network       = google_compute_network.default.id
}

resource "google_compute_network" "default" {
  name = "sample-network"
}

data "google_project" "project" {}

# Cloud Storage bucket name must be unique
resource "random_id" "default" {
  byte_length = 8
}

# Create a Cloud Storage bucket
resource "google_storage_bucket" "bucket" {
  name                        = "vertex-ai-index-bucket-${random_id.default.hex}"
  location                    = "us-central1"
  uniform_bucket_level_access = true
}

# Create index content
resource "google_storage_bucket_object" "data" {
  name    = "contents/data.json"
  bucket  = google_storage_bucket.bucket.name
  content = <<EOF
{"id": "42", "embedding": [0.5, 1.0], "restricts": [{"namespace": "class", "allow": ["cat", "pet"]},{"namespace": "category", "allow": ["feline"]}]}
{"id": "43", "embedding": [0.6, 1.0], "restricts": [{"namespace": "class", "allow": ["dog", "pet"]},{"namespace": "category", "allow": ["canine"]}]}
EOF
}

resource "google_vertex_ai_index" "default" {
  region       = "us-central1"
  display_name = "sample-index-batch-update"
  description  = "A sample index for batch update"
  labels = {
    foo = "bar"
  }

  metadata {
    contents_delta_uri = "gs://${google_storage_bucket.bucket.name}/contents"
    config {
      dimensions                  = 2
      approximate_neighbors_count = 150
      distance_measure_type       = "DOT_PRODUCT_DISTANCE"
      algorithm_config {
        tree_ah_config {
          leaf_node_embedding_count    = 500
          leaf_nodes_to_search_percent = 7
        }
      }
    }
  }
  index_update_method = "BATCH_UPDATE"

  timeouts {
    create = "2h"
    update = "1h"
  }
}
 

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def vector_search_deploy_index(
    project: str,
    location: str,
    index_name: str,
    index_endpoint_name: str,
    deployed_index_id: str,
) -> None:
    """Deploy a vector search index to a vector search index endpoint.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_name (str): Required. The index to deploy. A fully-qualified
            index resource name or an index ID. Example:
            "projects/123/locations/us-central1/indexes/my_index_id" or
            "my_index_id".
        index_endpoint_name (str): Required. Index endpoint to deploy the
            index to.
        deployed_index_id (str): Required. The user-specified ID of the
            DeployedIndex.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index instance from an existing index
    index = aiplatform.MatchingEngineIndex(index_name=index_name)

    # Create the index endpoint instance from an existing endpoint.
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Deploy Index to Endpoint
    index_endpoint = index_endpoint.deploy_index(
        index=index, deployed_index_id=deployed_index_id
    )

    print(index_endpoint.deployed_indexes)
 

Console

Use these instructions to deploy your index to an endpoint.

  1. In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section and select Vector Search.
  2. A list of your active indexes is displayed.
  3. Select the name of the index you want to deploy. The index details page opens.
  4. From the index details page, click Deploy to endpoint. The index deployment panel opens.
  5. Enter a display name. This name acts as an ID and can't be updated.
  6. From the Endpoint drop-down, select the endpoint you want to deploy this index to. Note: The endpoint is unavailable if the index is already deployed to it.
  7. Optional: In the Machine type field, select either standard or high-memory.
  8. Optional: Select Enable autoscaling to automatically resize the number of nodes based on the demands of your workloads. The default number of replicas is 2 if autoscaling is disabled.
  9. Click Deploy to deploy your index to the endpoint. Note: Deployment takes around 30 minutes.

Enable autoscaling

Vector Search supports autoscaling, which can automatically resize the number of nodes based on the demands of your workloads. When demand is high, nodes are added to the node pool, which won't exceed the maximum size you designate. When demand is low, the node pool scales back down to a minimum size that you designate. You can check the actual nodes in use and the changes by monitoring the current replicas.
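The effect of the two bounds can be pictured with a tiny sketch. It does not model how Vector Search decides how many nodes a workload needs; it only illustrates that whatever count the service would like to run is kept between the minimum and maximum you designate. The function name and the check that the maximum is at least the minimum are assumptions made for illustration.

```python
def clamp_replicas(desired: int, min_replica_count: int, max_replica_count: int) -> int:
    """Keep a desired replica count within the configured bounds."""
    if min_replica_count < 1:
        raise ValueError("min_replica_count must be equal to or larger than 1")
    if max_replica_count < min_replica_count:
        # Assumption: a maximum below the minimum is treated as invalid here.
        raise ValueError("max_replica_count must not be smaller than min_replica_count")
    return max(min_replica_count, min(desired, max_replica_count))
```

With bounds of 2 and 5, a demand that would call for 10 replicas is capped at 5, and a demand that would call for none stays at 2.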

To enable autoscaling, specify the maxReplicaCount and minReplicaCount when you deploy your index:

gcloud

The following example uses the gcloud ai index-endpoints deploy-index command.

Before using any of the command data below, make the following replacements:

  • INDEX_ENDPOINT_ID: The ID of the index endpoint.
  • DEPLOYED_INDEX_ID: A user-specified string that uniquely identifies the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
  • DEPLOYED_INDEX_NAME: Display name of the deployed index.
  • INDEX_ID: The ID of the index.
  • MIN_REPLICA_COUNT: Minimum number of machine replicas that the deployed index is always deployed on. If specified, the value must be equal to or larger than 1.
  • MAX_REPLICA_COUNT: Maximum number of machine replicas that the deployed index can be deployed on.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell

gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID \
  --deployed-index-id=DEPLOYED_INDEX_ID \
  --display-name=DEPLOYED_INDEX_NAME \
  --index=INDEX_ID \
  --min-replica-count=MIN_REPLICA_COUNT \
  --max-replica-count=MAX_REPLICA_COUNT \
  --region=LOCATION \
  --project=PROJECT_ID

Windows (PowerShell)

gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID `
  --deployed-index-id=DEPLOYED_INDEX_ID `
  --display-name=DEPLOYED_INDEX_NAME `
  --index=INDEX_ID `
  --min-replica-count=MIN_REPLICA_COUNT `
  --max-replica-count=MAX_REPLICA_COUNT `
  --region=LOCATION `
  --project=PROJECT_ID

Windows (cmd.exe)

gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID ^
  --deployed-index-id=DEPLOYED_INDEX_ID ^
  --display-name=DEPLOYED_INDEX_NAME ^
  --index=INDEX_ID ^
  --min-replica-count=MIN_REPLICA_COUNT ^
  --max-replica-count=MAX_REPLICA_COUNT ^
  --region=LOCATION ^
  --project=PROJECT_ID

REST

Before using any of the request data, make the following replacements:

  • INDEX_ENDPOINT_ID: The ID of the index endpoint.
  • DEPLOYED_INDEX_ID: A user-specified string that uniquely identifies the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
  • DEPLOYED_INDEX_NAME: Display name of the deployed index.
  • INDEX_ID: The ID of the index.
  • MIN_REPLICA_COUNT: Minimum number of machine replicas that the deployed index is always deployed on. If specified, the value must be equal to or larger than 1.
  • MAX_REPLICA_COUNT: Maximum number of machine replicas that the deployed index can be deployed on.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.
  • PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:deployIndex

Request JSON body:

{
  "deployedIndex": {
    "id": "DEPLOYED_INDEX_ID",
    "index": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID",
    "displayName": "DEPLOYED_INDEX_NAME",
    "automaticResources": {
      "minReplicaCount": MIN_REPLICA_COUNT,
      "maxReplicaCount": MAX_REPLICA_COUNT
    }
  }
}

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeployIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2023-10-19T17:53:16.502088Z",
      "updateTime": "2023-10-19T17:53:16.502088Z"
    },
    "deployedIndexId": "DEPLOYED_INDEX_ID"
  }
}
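If you assemble the deployIndex request in code rather than by hand, the URL and JSON body shown in this REST section can be built from the same placeholders. The helper below is a sketch with a hypothetical name. It only constructs the URL and the body, and leaves authentication and sending the request to whatever HTTP client you use. The `project` argument is used in both the URL and the index resource name, which is an assumption made to keep the example short.

```python
import json
from typing import Any, Dict, Tuple


def build_deploy_index_request(
    project: str,
    location: str,
    index_endpoint_id: str,
    index_id: str,
    deployed_index_id: str,
    display_name: str,
    min_replica_count: int,
    max_replica_count: int,
) -> Tuple[str, Dict[str, Any]]:
    """Return the deployIndex URL and request body with autoscaling bounds."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project}"
        f"/locations/{location}/indexEndpoints/{index_endpoint_id}:deployIndex"
    )
    body = {
        "deployedIndex": {
            "id": deployed_index_id,
            "index": f"projects/{project}/locations/{location}/indexes/{index_id}",
            "displayName": display_name,
            # Replica counts are JSON numbers, not strings.
            "automaticResources": {
                "minReplicaCount": min_replica_count,
                "maxReplicaCount": max_replica_count,
            },
        }
    }
    return url, body


if __name__ == "__main__":
    # Sample values only; replace them with your own resources.
    url, body = build_deploy_index_request(
        "my-project", "us-central1", "123", "456", "my_index", "My index", 2, 5
    )
    print(url)
    print(json.dumps(body, indent=2))
```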

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def vector_search_deploy_autoscaling_index(
    project: str,
    location: str,
    index_name: str,
    index_endpoint_name: str,
    deployed_index_id: str,
    min_replica_count: int,
    max_replica_count: int,
) -> None:
    """Deploy a vector search index to a vector search index endpoint.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_name (str): Required. The index to deploy. A fully-qualified
            index resource name or an index ID. Example:
            "projects/123/locations/us-central1/indexes/my_index_id" or
            "my_index_id".
        index_endpoint_name (str): Required. Index endpoint to deploy the
            index to.
        deployed_index_id (str): Required. The user-specified ID of the
            DeployedIndex.
        min_replica_count (int): Required. The minimum number of replicas to
            deploy.
        max_replica_count (int): Required. The maximum number of replicas to
            deploy.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index instance from an existing index
    index = aiplatform.MatchingEngineIndex(index_name=index_name)

    # Create the index endpoint instance from an existing endpoint.
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Deploy Index to Endpoint. Specifying min and max replica counts will
    # enable autoscaling.
    index_endpoint.deploy_index(
        index=index,
        deployed_index_id=deployed_index_id,
        min_replica_count=min_replica_count,
        max_replica_count=max_replica_count,
    )
 

Console

You can only enable autoscaling from the console during index deployment.

  1. In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section and select Vector Search.
  2. A list of your active indexes is displayed.
  3. Select the name of the index you want to deploy. The index details page opens.
  4. From the index details page, click Deploy to endpoint. The index deployment panel opens.
  5. Enter a display name. This name acts as an ID and can't be updated.
  6. From the Endpoint drop-down, select the endpoint you want to deploy this index to. Note: The endpoint is unavailable if the index is already deployed to it.
  7. Optional: In the Machine type field, select either standard or high-memory.
  8. Optional: Select Enable autoscaling to automatically resize the number of nodes based on the demands of your workloads. The default number of replicas is 2 if autoscaling is disabled.
  • If neither minReplicaCount nor maxReplicaCount is set, both are set to 2 by default.
  • If only maxReplicaCount is set, minReplicaCount is set to 2 by default.
  • If only minReplicaCount is set, maxReplicaCount is set to equal minReplicaCount.
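The three defaulting rules above can be written out as a small function, which makes it easy to see what a given combination of settings resolves to. This is an illustrative sketch with a hypothetical function name. It mirrors only the rules listed here and does not call any Vertex AI API.

```python
from typing import Optional, Tuple

DEFAULT_REPLICA_COUNT = 2


def resolve_replica_counts(
    min_replica_count: Optional[int] = None,
    max_replica_count: Optional[int] = None,
) -> Tuple[int, int]:
    """Return the effective (min, max) replica counts after defaulting."""
    if min_replica_count is None and max_replica_count is None:
        # Neither value set: both default to 2.
        return DEFAULT_REPLICA_COUNT, DEFAULT_REPLICA_COUNT
    if min_replica_count is None:
        # Only the maximum set: the minimum defaults to 2.
        return DEFAULT_REPLICA_COUNT, max_replica_count
    if max_replica_count is None:
        # Only the minimum set: the maximum equals the minimum.
        return min_replica_count, min_replica_count
    return min_replica_count, max_replica_count
```

Note that setting only minReplicaCount yields equal bounds, so the replica count stays fixed at that value.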

Mutate a DeployedIndex

You can use the MutateDeployedIndex API to update the deployment resources (for example, minReplicaCount and maxReplicaCount) of an already deployed index.

  • You can't change the machineType after the index is deployed.
  • If maxReplicaCount is not specified in the request, the DeployedIndex keeps using the existing maxReplicaCount.
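The two rules above can be modeled locally to reason about the outcome of a mutation before sending it. The sketch below is purely illustrative: the function name and the flat dictionary layout are hypothetical, it does not call the API, and the machine type string in the usage example is only a sample value.

```python
from typing import Any, Dict, Optional


def apply_mutation(
    existing: Dict[str, Any],
    min_replica_count: int,
    max_replica_count: Optional[int] = None,
    machine_type: Optional[str] = None,
) -> Dict[str, Any]:
    """Return the resource settings a mutation would produce."""
    if machine_type is not None and machine_type != existing.get("machineType"):
        raise ValueError("machineType can't be changed after the index is deployed")
    updated = dict(existing)  # Leave the caller's dictionary untouched.
    updated["minReplicaCount"] = min_replica_count
    if max_replica_count is not None:
        updated["maxReplicaCount"] = max_replica_count
    # When max_replica_count is omitted, the existing maximum is kept.
    return updated


if __name__ == "__main__":
    current = {"machineType": "e2-standard-16", "minReplicaCount": 2, "maxReplicaCount": 5}
    print(apply_mutation(current, min_replica_count=3))
```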

gcloud

The following example uses the gcloud ai index-endpoints mutate-deployed-index command.

Before using any of the command data below, make the following replacements:

  • INDEX_ENDPOINT_ID: The ID of the index endpoint.
  • DEPLOYED_INDEX_ID: A user-specified string that uniquely identifies the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
  • MIN_REPLICA_COUNT: Minimum number of machine replicas that the deployed index is always deployed on. If specified, the value must be equal to or larger than 1.
  • MAX_REPLICA_COUNT: Maximum number of machine replicas that the deployed index can be deployed on.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell

gcloud ai index-endpoints mutate-deployed-index INDEX_ENDPOINT_ID \
  --deployed-index-id=DEPLOYED_INDEX_ID \
  --min-replica-count=MIN_REPLICA_COUNT \
  --max-replica-count=MAX_REPLICA_COUNT \
  --region=LOCATION \
  --project=PROJECT_ID

Windows (PowerShell)

gcloud ai index-endpoints mutate-deployed-index INDEX_ENDPOINT_ID `
  --deployed-index-id=DEPLOYED_INDEX_ID `
  --min-replica-count=MIN_REPLICA_COUNT `
  --max-replica-count=MAX_REPLICA_COUNT `
  --region=LOCATION `
  --project=PROJECT_ID

Windows (cmd.exe)

gcloud ai index-endpoints mutate-deployed-index INDEX_ENDPOINT_ID ^
  --deployed-index-id=DEPLOYED_INDEX_ID ^
  --min-replica-count=MIN_REPLICA_COUNT ^
  --max-replica-count=MAX_REPLICA_COUNT ^
  --region=LOCATION ^
  --project=PROJECT_ID

REST

Before using any of the request data, make the following replacements:

  • INDEX_ENDPOINT_ID: The ID of the index endpoint.
  • DEPLOYED_INDEX_ID: A user-specified string that uniquely identifies the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
  • MIN_REPLICA_COUNT: Minimum number of machine replicas that the deployed index is always deployed on. If specified, the value must be equal to or larger than 1.
  • MAX_REPLICA_COUNT: Maximum number of machine replicas that the deployed index can be deployed on.
  • LOCATION: The region where you are using Vertex AI.
  • PROJECT_ID: Your Google Cloud project ID.
  • PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:mutateDeployedIndex

Request JSON body:

{
  "deployedIndex": {
    "id": "DEPLOYED_INDEX_ID",
    "index": "projects/PROJECT_ID/locations/LOCATION/indexes/INDEX_ID",
    "displayName": "DEPLOYED_INDEX_NAME",
    "min_replica_count": "MIN_REPLICA_COUNT",
    "max_replica_count": "MAX_REPLICA_COUNT"
  }
}

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeployIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2020-10-19T17:53:16.502088Z",
      "updateTime": "2020-10-19T17:53:16.502088Z"
    },
    "deployedIndexId": "DEPLOYED_INDEX_ID"
  }
}

Terraform

To learn how to apply or remove a Terraform configuration, see Basic Terraform commands. For more information, see the Terraform provider reference documentation.

provider "google" {
  region = "us-central1"
}

resource "google_vertex_ai_index_endpoint_deployed_index" "default" {
  depends_on        = [google_vertex_ai_index_endpoint.default]
  index_endpoint    = google_vertex_ai_index_endpoint.default.id
  index             = google_vertex_ai_index.default.id
  deployed_index_id = "deployed_index_for_mutate_vpc"
  # This example assumes the deployed index endpoint's resources configuration
  # differs from the values specified below. Terraform will mutate the deployed
  # index endpoint's resource configuration to match.
  automatic_resources {
    min_replica_count = 3
    max_replica_count = 5
  }
}

resource "google_vertex_ai_index_endpoint" "default" {
  display_name = "sample-endpoint"
  description  = "A sample index endpoint within a VPC network"
  region       = "us-central1"
  network      = "projects/${data.google_project.project.number}/global/networks/${google_compute_network.default.name}"
  depends_on = [
    google_service_networking_connection.default
  ]
}

resource "google_service_networking_connection" "default" {
  network                 = google_compute_network.default.id
  service                 = "servicenetworking.googleapis.com"
  reserved_peering_ranges = [google_compute_global_address.default.name]
  # Workaround to allow `terraform destroy`, see https://github.com/hashicorp/terraform-provider-google/issues/18729
  deletion_policy = "ABANDON"
}

resource "google_compute_global_address" "default" {
  name          = "sample-address"
  purpose       = "VPC_PEERING"
  address_type  = "INTERNAL"
  prefix_length = 16
  network       = google_compute_network.default.id
}

resource "google_compute_network" "default" {
  name = "sample-network"
}

data "google_project" "project" {}

# Cloud Storage bucket name must be unique
resource "random_id" "default" {
  byte_length = 8
}

# Create a Cloud Storage bucket
resource "google_storage_bucket" "bucket" {
  name                        = "vertex-ai-index-bucket-${random_id.default.hex}"
  location                    = "us-central1"
  uniform_bucket_level_access = true
}

# Create index content
resource "google_storage_bucket_object" "data" {
  name    = "contents/data.json"
  bucket  = google_storage_bucket.bucket.name
  content = <<EOF
{"id": "42", "embedding": [0.5, 1.0], "restricts": [{"namespace": "class", "allow": ["cat", "pet"]},{"namespace": "category", "allow": ["feline"]}]}
{"id": "43", "embedding": [0.6, 1.0], "restricts": [{"namespace": "class", "allow": ["dog", "pet"]},{"namespace": "category", "allow": ["canine"]}]}
EOF
}

resource "google_vertex_ai_index" "default" {
  region       = "us-central1"
  display_name = "sample-index-batch-update"
  description  = "A sample index for batch update"
  labels = {
    foo = "bar"
  }

  metadata {
    contents_delta_uri = "gs://${google_storage_bucket.bucket.name}/contents"
    config {
      dimensions                  = 2
      approximate_neighbors_count = 150
      distance_measure_type       = "DOT_PRODUCT_DISTANCE"
      algorithm_config {
        tree_ah_config {
          leaf_node_embedding_count    = 500
          leaf_nodes_to_search_percent = 7
        }
      }
    }
  }
  index_update_method = "BATCH_UPDATE"

  timeouts {
    create = "2h"
    update = "1h"
  }
}


Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python . For more information, see the Python API reference documentation .

from google.cloud import aiplatform


def vector_search_mutate_deployed_index(
    project: str,
    location: str,
    index_endpoint_name: str,
    deployed_index_id: str,
    min_replica_count: int,
    max_replica_count: int,
) -> None:
    """Mutate the deployment resources of an already deployed index.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_endpoint_name (str): Required. Index endpoint that hosts the
          deployed index.
        deployed_index_id (str): Required. The ID of the DeployedIndex to
          mutate.
        min_replica_count (int): Required. The minimum number of replicas to
          deploy.
        max_replica_count (int): Required. The maximum number of replicas to
          deploy.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index endpoint instance from an existing endpoint
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Mutate the deployed index
    index_endpoint.mutate_deployed_index(
        deployed_index_id=deployed_index_id,
        min_replica_count=min_replica_count,
        max_replica_count=max_replica_count,
    )


Deployment settings that impact performance

The following deployment settings can affect latency, availability, and cost when using Vector Search. This guidance applies to most cases. However, always experiment with your configurations to make sure that they work for your use case.

Setting: Machine type
Performance impact:

The hardware selection interacts directly with the selected shard size. Depending on the shard choices you specified at index creation time, each machine type offers a tradeoff between performance and cost.

See the pricing page for the available hardware and pricing. In general, performance increases in the following order:

  • E2 standard
  • E2 highmem
  • N1 standard
  • N2D standard

Setting: Minimum replica count
Performance impact:

minReplicaCount reserves a minimum capacity for availability and latency so that the system doesn't have cold start issues when traffic scales up quickly from low levels.

If you have workloads that drop to low levels and then quickly increase to higher levels, consider setting minReplicaCount to a number that can accommodate the initial bursts of traffic.

Setting: Maximum replica count
Performance impact:

maxReplicaCount primarily lets you control usage cost. You can choose to prevent costs from increasing beyond a certain threshold, with the tradeoff of allowing increased latency and reduced availability.
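
As a rough illustration of how the two replica settings bound capacity, the following sketch clamps a demand estimate between minReplicaCount and maxReplicaCount. It does not describe how Vector Search autoscaling is implemented. The function name bounded_replicas and the idea of a single "desired replicas" number are assumptions made for illustration only.

```python
def bounded_replicas(
    desired: int, min_replica_count: int, max_replica_count: int
) -> int:
    """Clamp a hypothetical demand estimate to the configured replica range."""
    if min_replica_count < 1:
        raise ValueError("min_replica_count must be at least 1")
    if max_replica_count < min_replica_count:
        raise ValueError("max_replica_count must not be below min_replica_count")
    # The floor keeps capacity warm at low traffic; the ceiling caps cost.
    return max(min_replica_count, min(desired, max_replica_count))


# Low traffic still keeps the reserved minimum.
print(bounded_replicas(0, min_replica_count=2, max_replica_count=10))
# A burst beyond the maximum is capped, which trades latency for cost.
print(bounded_replicas(25, min_replica_count=2, max_replica_count=10))
```

In this model, demand above the maximum goes unserved by extra replicas, which corresponds to the increased latency and reduced availability described above.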

List IndexEndpoints

To list your IndexEndpoint resources and view the information of any associated DeployedIndex instances, run the following code:

gcloud

The following example uses the gcloud ai index-endpoints list command .

Before using any of the command data below, make the following replacements:

  • LOCATION : The region where you are using Vertex AI.
  • PROJECT_ID : Your Google Cloud project ID .

Execute the following command:

Linux, macOS, or Cloud Shell

gcloud ai index-endpoints list \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

gcloud ai index-endpoints list `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

gcloud ai index-endpoints list ^
    --region=LOCATION ^
    --project=PROJECT_ID
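
If you want to call the same command from a script, one option is to build the argument list in Python and request JSON output with the gcloud-wide --format=json flag. This is a sketch under assumptions: the helper names list_command and list_index_endpoints are hypothetical, and the gcloud CLI must be installed and authenticated for the second function to work.

```python
import json
import subprocess
from typing import List


def list_command(project_id: str, location: str) -> List[str]:
    """Build the argument list for `gcloud ai index-endpoints list`."""
    return [
        "gcloud", "ai", "index-endpoints", "list",
        f"--region={location}",
        f"--project={project_id}",
        "--format=json",  # ask gcloud for machine-readable output
    ]


def list_index_endpoints(project_id: str, location: str) -> list:
    """Run the command and parse its JSON output (requires the gcloud CLI)."""
    completed = subprocess.run(
        list_command(project_id, location),
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(completed.stdout)
```

Passing the arguments as a list avoids the shell-specific line continuation characters shown above.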

REST

Before using any of the request data, make the following replacements:

  • LOCATION : The region where you are using Vertex AI.
  • PROJECT_ID : Your Google Cloud project ID .
  • PROJECT_NUMBER : Your project's automatically generated project number .

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints

You should receive a JSON response similar to the following:

{
  "indexEndpoints": [
    {
      "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID",
      "displayName": "INDEX_ENDPOINT_DISPLAY_NAME",
      "deployedIndexes": [
        {
          "id": "DEPLOYED_INDEX_ID",
          "index": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID",
          "displayName": "DEPLOYED_INDEX_DISPLAY_NAME",
          "createTime": "2021-06-04T02:23:40.178286Z",
          "privateEndpoints": {
            "matchGrpcAddress": "GRPC_ADDRESS"
          },
          "indexSyncTime": "2022-01-13T04:22:00.151916Z",
          "automaticResources": {
            "minReplicaCount": 2,
            "maxReplicaCount": 10
          }
        }
      ],
      "etag": "AMEw9yP367UitPkLo-khZ1OQvqIK8Q0vLAzZVF7QjdZ5O3l7Zow-mzBo2l6xmiuuMljV",
      "createTime": "2021-03-17T04:47:28.460373Z",
      "updateTime": "2021-06-04T02:23:40.930513Z",
      "network": "VPC_NETWORK_NAME"
    }
  ]
}
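
A response of this shape can be flattened into one row per deployed index. The following sketch assumes only the field names that appear in the response above. The function name summarize_deployed_indexes and the values in SAMPLE_RESPONSE are hypothetical.

```python
from typing import Dict, List


def summarize_deployed_indexes(list_response: dict) -> List[Dict[str, object]]:
    """Return one row per DeployedIndex found in an indexEndpoints list response."""
    rows = []
    for endpoint in list_response.get("indexEndpoints", []):
        # An endpoint with nothing deployed may omit deployedIndexes entirely.
        for deployed in endpoint.get("deployedIndexes", []):
            resources = deployed.get("automaticResources", {})
            rows.append(
                {
                    "endpoint": endpoint["name"],
                    "deployed_index_id": deployed["id"],
                    "min_replicas": resources.get("minReplicaCount"),
                    "max_replicas": resources.get("maxReplicaCount"),
                }
            )
    return rows


# Hypothetical values that follow the response structure shown above.
SAMPLE_RESPONSE = {
    "indexEndpoints": [
        {
            "name": "projects/123/locations/us-central1/indexEndpoints/456",
            "deployedIndexes": [
                {
                    "id": "my_deployed_index",
                    "automaticResources": {
                        "minReplicaCount": 2,
                        "maxReplicaCount": 10,
                    },
                }
            ],
        },
        {"name": "projects/123/locations/us-central1/indexEndpoints/789"},
    ]
}

for row in summarize_deployed_indexes(SAMPLE_RESPONSE):
    print(row)
```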

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python . For more information, see the Python API reference documentation .

from typing import List

from google.cloud import aiplatform


def vector_search_list_index_endpoint(
    project: str, location: str
) -> List[aiplatform.MatchingEngineIndexEndpoint]:
    """List vector search index endpoints.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name

    Returns:
        List of aiplatform.MatchingEngineIndexEndpoint
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # List Index Endpoints
    return aiplatform.MatchingEngineIndexEndpoint.list()


Console

Use these instructions to view a list of your index endpoints.

  1. In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.

    Go to Vector Search

  2. At the top of the page, select the Index endpoint tab.
  3. All of the existing index endpoints are displayed.

For more information, see the reference documentation for IndexEndpoint .

Undeploy an index

To undeploy an index, run the following code:

gcloud

The following example uses the gcloud ai index-endpoints undeploy-index command .

Before using any of the command data below, make the following replacements:

  • INDEX_ENDPOINT_ID : The ID of the index endpoint.
  • DEPLOYED_INDEX_ID : A user-specified string that uniquely identifies the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
  • LOCATION : The region where you are using Vertex AI.
  • PROJECT_ID : Your Google Cloud project ID .

Execute the following command:

Linux, macOS, or Cloud Shell

gcloud ai index-endpoints undeploy-index INDEX_ENDPOINT_ID \
    --deployed-index-id=DEPLOYED_INDEX_ID \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

gcloud ai index-endpoints undeploy-index INDEX_ENDPOINT_ID `
    --deployed-index-id=DEPLOYED_INDEX_ID `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

gcloud ai index-endpoints undeploy-index INDEX_ENDPOINT_ID ^
    --deployed-index-id=DEPLOYED_INDEX_ID ^
    --region=LOCATION ^
    --project=PROJECT_ID

REST

Before using any of the request data, make the following replacements:

  • INDEX_ENDPOINT_ID : The ID of the index endpoint.
  • DEPLOYED_INDEX_ID : A user-specified string that uniquely identifies the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
  • LOCATION : The region where you are using Vertex AI.
  • PROJECT_ID : Your Google Cloud project ID .
  • PROJECT_NUMBER : Your project's automatically generated project number .

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:undeployIndex

Request JSON body:

{
  "deployed_index_id": "DEPLOYED_INDEX_ID"
}

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.UndeployIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-13T04:09:56.641107Z",
      "updateTime": "2022-01-13T04:09:56.641107Z"
    }
  }
}
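
The name field in this response identifies the long-running operation. If you need its parts, for example to log the operation ID, a sketch like the following can split the name. The pattern is derived only from the name format shown in the response above, and the function name parse_operation_name is hypothetical.

```python
import re
from typing import Dict

# Matches the operation name format shown in the response above.
_OPERATION_NAME_RE = re.compile(
    r"projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/indexEndpoints/(?P<index_endpoint>[^/]+)"
    r"/operations/(?P<operation>[^/]+)"
)


def parse_operation_name(name: str) -> Dict[str, str]:
    """Split an index endpoint operation name into its components."""
    match = _OPERATION_NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"Unexpected operation name format: {name!r}")
    return match.groupdict()


# Hypothetical identifiers, for illustration only.
parts = parse_operation_name(
    "projects/123/locations/us-central1/indexEndpoints/456/operations/789"
)
print(parts["operation"])
```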

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python . For more information, see the Python API reference documentation .

from google.cloud import aiplatform


def vector_search_undeploy_index(
    project: str,
    location: str,
    index_endpoint_name: str,
    deployed_index_id: str,
) -> None:
    """Undeploy a deployed index from an index endpoint.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_endpoint_name (str): Required. Index endpoint that hosts the
          deployed index.
        deployed_index_id (str): Required. The ID of the DeployedIndex to
          undeploy.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index endpoint instance from an existing endpoint
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Undeploy the index
    index_endpoint.undeploy_index(
        deployed_index_id=deployed_index_id,
    )


Console

Use these instructions to undeploy an index.

  1. In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.

    Go to Vector Search

  2. A list of your active indexes is displayed.
  3. Select the index you want to undeploy. The index details page opens.
  4. Under the Deployed indexes section, identify the index endpoint from which you want to undeploy the index.
  5. Click the options menu that is in the same row as the index endpoint and select Undeploy .
  6. A confirmation screen opens. Click Undeploy . Note: Undeploying the index can take up to 30 minutes.

Delete an IndexEndpoint

Before you delete an IndexEndpoint , you must undeploy all indexes deployed to the endpoint.
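
One way to check this precondition is to look at the deployedIndexes field of the IndexEndpoint resource, as it appears in the list response earlier on this page. The following sketch returns the IDs that would still need to be undeployed. The function name indexes_blocking_delete and the sample values are hypothetical.

```python
from typing import List


def indexes_blocking_delete(index_endpoint: dict) -> List[str]:
    """Return the IDs of deployed indexes that must be undeployed first."""
    # An endpoint with nothing deployed may omit deployedIndexes entirely.
    return [deployed["id"] for deployed in index_endpoint.get("deployedIndexes", [])]


# Hypothetical endpoint resource with one index still deployed.
endpoint = {
    "name": "projects/123/locations/us-central1/indexEndpoints/456",
    "deployedIndexes": [{"id": "my_deployed_index"}],
}

blocking = indexes_blocking_delete(endpoint)
if blocking:
    print("Undeploy these first:", ", ".join(blocking))
else:
    print("Safe to delete.")
```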

gcloud

The following example uses the gcloud ai index-endpoints delete command.

Before using any of the command data below, make the following replacements:

  • INDEX_ENDPOINT_ID : The ID of the index endpoint.
  • LOCATION : The region where you are using Vertex AI.
  • PROJECT_ID : Your Google Cloud project ID .

Execute the following command:

Linux, macOS, or Cloud Shell

gcloud ai index-endpoints delete INDEX_ENDPOINT_ID \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

gcloud ai index-endpoints delete INDEX_ENDPOINT_ID `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

gcloud ai index-endpoints delete INDEX_ENDPOINT_ID ^
    --region=LOCATION ^
    --project=PROJECT_ID

REST

Before using any of the request data, make the following replacements:

  • INDEX_ENDPOINT_ID : The ID of the index endpoint.
  • LOCATION : The region where you are using Vertex AI.
  • PROJECT_ID : Your Google Cloud project ID .
  • PROJECT_NUMBER : Your project's automatically generated project number .

HTTP method and URL:

DELETE https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeleteOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-13T04:36:19.142203Z",
      "updateTime": "2022-01-13T04:36:19.142203Z"
    }
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.protobuf.Empty"
  }
}
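
Unlike the deploy and undeploy responses, this response includes "done": true. In a long-running operation resource, done indicates that the operation finished, and a finished operation carries either a response or an error. The following sketch classifies an operation on that basis. The function name operation_status and its three return strings are hypothetical.

```python
def operation_status(operation: dict) -> str:
    """Classify a long-running operation as running, failed, or succeeded."""
    # A missing "done" field means the operation is still in progress.
    if not operation.get("done", False):
        return "running"
    # A finished operation carries either "error" or "response".
    if "error" in operation:
        return "failed"
    return "succeeded"


# Shapes taken from the responses in this document; values are abbreviated.
print(operation_status({"name": "operations/1", "metadata": {}}))
print(operation_status({"name": "operations/2", "done": True, "response": {}}))
```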

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python . For more information, see the Python API reference documentation .

from google.cloud import aiplatform


def vector_search_delete_index_endpoint(
    project: str, location: str, index_endpoint_name: str, force: bool = False
) -> None:
    """Delete a vector search index endpoint.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_endpoint_name (str): Required. Index endpoint to delete.
        force (bool): Optional. If true, undeploy any deployed indexes on this
          endpoint before deletion. Defaults to False.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index endpoint instance from an existing endpoint
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Delete the index endpoint
    index_endpoint.delete(force=force)


Console

Use these instructions to delete an index endpoint.

  1. In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.

    Go to Vector Search

  2. At the top of the page, select the Index endpoints tab.
  3. All of the existing index endpoints are displayed.
  4. Click the options menu that is in the same row as the index endpoint you want to delete and select Delete .
  5. A confirmation screen opens. Click Delete . Your index endpoint is now deleted.