Data Objects

In Vector Search 2.0, Collections store data as individual JSON objects called Data Objects. This page describes how to create Data Objects, import them from Cloud Storage buckets, get them, update them, and delete them.

Creating a Data Object

The following example adds a Data Object to a Collection named movies.

  curl -X POST \
    'https://vectorsearch.googleapis.com/v1alpha/projects/PROJECT_ID/locations/LOCATION/collections/movies/dataObjects?dataObjectId=the-shawshank-redemption' \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H 'Content-Type: application/json' \
    -d '{
      "data": {
        "title": "The Shawshank Redemption",
        "genre": "Drama",
        "year": 1994,
        "director": "Frank Darabont"
      },
      "vectors": {
        "plot_embedding": {
          "dense": {
            "values": [0.4752082440607731, 0.09026746166854707, 0.8752307753619009]
          }
        },
        "genre_embedding": {
          "dense": {
            "values": [0.38638010860523064, 0.739343471733759, 0.16189056837017107, 0.5271366865924485]
          }
        },
        "soundtrack_embedding": {
          "dense": {
            "values": [0.5920451749052875, 0.08301644173787519, 0.1264733498775969, 0.6196429624200321, 0.4925828581737443]
          }
        },
        "sparse_embedding": {
          "sparse": {
            "values": [1, 6, 3, 2, 8, 5, 2],
            "indices": [4065, 13326, 17377, 25918, 28105, 32683, 42998]
          }
        }
      }
    }'
 

Embedding fields that have an auto-embedding specified in the Collection Schema are automatically populated. You can also bring your own embeddings (BYOE) to set vector field values that are not automatically populated.
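
The requests on this page use the placeholders PROJECT_ID and LOCATION in every URL. As a convenience, the URLs can be assembled once from shell variables. The following is a minimal sketch; the values my-project and us-central1 and all variable names are hypothetical, and the URL layout is copied from the examples on this page.

```shell
#!/usr/bin/env bash
# Hypothetical values: substitute your own project, location, and Collection.
PROJECT_ID="my-project"
LOCATION="us-central1"
COLLECTION="movies"
DATA_OBJECT_ID="the-shawshank-redemption"

# URL layout taken from the curl examples on this page.
API_ROOT="https://vectorsearch.googleapis.com/v1alpha"
COLLECTION_URL="${API_ROOT}/projects/${PROJECT_ID}/locations/${LOCATION}/collections/${COLLECTION}"

# URL used to create a Data Object with a chosen ID (POST).
CREATE_URL="${COLLECTION_URL}/dataObjects?dataObjectId=${DATA_OBJECT_ID}"
# URL used to get, update, or delete an existing Data Object.
OBJECT_URL="${COLLECTION_URL}/dataObjects/${DATA_OBJECT_ID}"

echo "create: ${CREATE_URL}"
echo "get/update/delete: ${OBJECT_URL}"
```

With these variables set, a request can pass "$CREATE_URL" or "$OBJECT_URL" to curl in double quotes instead of a hard-coded URL.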

Importing Data Objects

The following example imports Data Objects from a JSONL file in Cloud Storage into a Collection named movies.

  curl -X POST \
    "https://vectorsearch.googleapis.com/v1alpha/projects/PROJECT_ID/locations/LOCATION/collections/movies:importDataObjects" \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d '{
      "gcs_import": {
        "contents_uri": "gs://your-bucket/path/to/your-data.jsonl",
        "error_uri": "gs://your-bucket/path/to/import-errors/"
      }
    }'
 

For very large datasets, you can bulk import data from a Cloud Storage bucket. The file format for Vector Search 2.0 is JSONL, where each line is a JSON object with three top-level properties: data_object_id, data, and vectors.

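The following example shows one JSONL record with the required properties. It is spread over several lines here for readability; in the import file, each record occupies a single line.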

  {
    "data_object_id": "movie-789",
    "data": {
      "title": "The Shawshank Redemption",
      "plot": "...",
      "year": 1994,
      "avg_rating": 8.5,
      "movie_runtime_info": {
        "hours": 2,
        "minutes": 5
      }
    },
    "vectors": {
      "title_embedding": [-0.23, 0.88, 0.11, ...],
      "sparse_embedding": {
        "values": [0.01, -0.93, 0.27, ...],
        "indices": [23, 83, 131, ...]
      }
    }
  }
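
Because every record must fit on one line and carry all three top-level properties, a quick local check before uploading can catch obvious mistakes. The following is a rough sketch, not a JSON validator: it only tests that each line contains the three quoted key names. The temporary file, the second record, and the short vector values are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Sketch: write a small JSONL import file and run a rough sanity check on it.
jsonl_file="$(mktemp)"

# One complete JSON object per line; records are not pretty-printed.
# The records and vector values below are illustrative only.
cat > "$jsonl_file" <<'EOF'
{"data_object_id": "movie-789", "data": {"title": "The Shawshank Redemption", "year": 1994}, "vectors": {"title_embedding": [-0.23, 0.88, 0.11]}}
{"data_object_id": "movie-790", "data": {"title": "The Green Mile", "year": 1999}, "vectors": {"title_embedding": [0.12, -0.45, 0.67]}}
EOF

# Count the records (one per line).
total=$(wc -l < "$jsonl_file" | tr -d ' ')

# Count the lines that lack any of the three required key names.
# This is a substring test only; it does not parse the JSON.
bad=0
while IFS= read -r line; do
  for key in '"data_object_id"' '"data"' '"vectors"'; do
    case "$line" in
      *"$key"*) ;;                      # key name present, check the next one
      *) bad=$((bad + 1)); break ;;     # key name missing, flag this line once
    esac
  done
done < "$jsonl_file"

echo "records: $total, records missing a required key: $bad"
```

If the second count is zero, the file can be copied to the bucket, for example with gcloud storage cp, and then referenced in contents_uri.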
 

Getting a Data Object

The following example gets the Data Object named the-shawshank-redemption from the movies Collection.

  curl -X GET \
    'https://vectorsearch.googleapis.com/v1alpha/projects/PROJECT_ID/locations/LOCATION/collections/movies/dataObjects/the-shawshank-redemption' \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H 'Content-Type: application/json'
 

Updating a Data Object

The following example updates the title field and the plot_embedding vector of the Data Object named the-shawshank-redemption in the movies Collection.

  curl -X PATCH \
    'https://vectorsearch.googleapis.com/v1alpha/projects/PROJECT_ID/locations/LOCATION/collections/movies/dataObjects/the-shawshank-redemption' \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H 'Content-Type: application/json' \
    -d '{
      "data": {
        "title": "The Shawshank Redemption (updated)"
      },
      "vectors": {
        "plot_embedding": {
          "dense": {
            "values": [1.0, 1.0, 1.0]
          }
        }
      }
    }'
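
Longer request bodies are easier to maintain in a file than inline in the command, and curl can read the body from a file with -d @FILE. The following is a minimal sketch under that approach; the function name update_data_object and the temporary file are hypothetical, and the body is the same one used in the update example.

```shell
#!/usr/bin/env bash
# Sketch: keep the PATCH body in a file and pass it to curl with -d @FILE.
body_file="$(mktemp)"

# Same request body as the update example on this page.
cat > "$body_file" <<'EOF'
{
  "data": {
    "title": "The Shawshank Redemption (updated)"
  },
  "vectors": {
    "plot_embedding": {
      "dense": {
        "values": [1.0, 1.0, 1.0]
      }
    }
  }
}
EOF

# Hypothetical helper. It is defined but not called here,
# so running this sketch sends no request.
update_data_object() {
  # $1: full Data Object URL, $2: path to the JSON body file
  curl -X PATCH "$1" \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d @"$2"
}

echo "request body written to $body_file"
```

To send the request, call update_data_object with the Data Object URL from the update example as the first argument and "$body_file" as the second.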
 

Deleting Data Objects

You can delete individual Data Objects by name or batch delete Data Objects that match a filter.

The following shows how to delete the Data Object the-shawshank-redemption from the movies Collection.

  curl -X DELETE \
    'https://vectorsearch.googleapis.com/v1alpha/projects/PROJECT_ID/locations/LOCATION/collections/movies/dataObjects/the-shawshank-redemption' \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H 'Content-Type: application/json'
 
