This example workflow demonstrates how the embedding()
function works together with your table-stored data and the pgvector
feature. The example uses plain-text input to fetch a result from a database
that relies on large language model (LLM)-driven semantic parsing of the
text's meaning.
An example scenario
Imagine a database running on AlloyDB with the following aspects:
-  The database contains a table, items. Each row in this table describes an item that your business sells.
-  The items table contains a column, complaints. This TEXT column stores buyer complaints logged about each item.
-  The database integrates with the Vertex AI Model Garden, giving it access to the textembedding-gecko English models.
Even though this database stores complaints about items, these complaints are stored as plain text, making it challenging to query. For example, if you want to see which items have the most complaints from customers who received the wrong color of merchandise, then you can perform ordinary SQL queries on the table, looking for various keyword matches. However, this approach only matches rows that contain those exact keywords.
For example, a basic SQL query such as SELECT * FROM items WHERE complaints
LIKE '%wrong color%' doesn't return a row whose complaints field contains only
The picture shows a blue one, but the one I received was red.
SQL queries using LLM-powered embeddings can help bridge this gap. By applying embeddings, you can query the table in this example for items whose complaints have semantic similarity to a given text prompt, such as "It was the wrong color".
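The idea can be sketched outside the database. The toy Python script below ranks complaints by Euclidean (L2) distance between embedding vectors, the same metric behind pgvector's <-> operator; the three-dimensional vectors and the query vector are invented for illustration, whereas real textembedding-gecko embeddings have 768 dimensions.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance, the metric behind pgvector's <-> operator."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical embeddings: semantically similar texts map to nearby points.
complaints = {
    "The picture shows a blue one, but the one I received was red": [0.9, 0.1, 0.2],
    "Arrived two weeks late": [0.1, 0.8, 0.7],
}
query = [0.85, 0.15, 0.25]  # stand-in embedding of "It was the wrong color"

# Sort complaints by distance to the query embedding, nearest first.
ranked = sorted(complaints, key=lambda text: l2_distance(complaints[text], query))
print(ranked[0])  # the color complaint ranks first, despite sharing no keywords
```

The color complaint ranks closest even though it shares no keywords with the query, which is exactly the gap that a LIKE pattern cannot bridge.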
The following steps show how to enable this in the example setup described earlier.
Before you begin
Make sure that you meet the following requirements:
Required database extension
-  Ensure that the following extensions are installed on your AlloyDB database:
   -  google_ml_integration extension
   -  vector extension, version 0.5.0.google-1 or later
   For more information about installing and managing extensions, see Configure database extensions.
-  Set the google_ml_integration.enable_model_support database flag to off.
Set up model access
Before you can generate embeddings from an AlloyDB database, you must configure AlloyDB to work with a text embedding model.
To work with the cloud-based textembedding-gecko
model, you need to integrate your database with Vertex AI.
Prepare the table
Before you run LLM-based queries on the items 
table's content, you must
prepare the table to store and index embeddings based on your existing
data.
Create a column to store embeddings
Add a column to the table for storing embeddings.
  ALTER TABLE items ADD COLUMN complaint_embedding vector(768);
This example specifies 768 as an argument, because that is the number of
dimensions supported by the textembedding-gecko English models. For more
information, see Generate an embedding.
The example applies the vector data type to the column for ease of using
pgvector functions and operators with its values.
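If you build vector literals on the client side, pgvector accepts values in the text form [v1,v2,...]. The helper below is a hypothetical client-side sketch (not part of AlloyDB) that formats a Python list into that literal and validates the 768 dimensions expected by the column created above.

```python
def to_pgvector_literal(values, expected_dim=768):
    """Format a list of floats as a pgvector text literal, e.g. '[0.1,0.2,0.3]'.

    Raises ValueError if the vector does not match the column's dimensions.
    """
    if len(values) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(values)}")
    return "[" + ",".join(repr(float(v)) for v in values) + "]"

# A zero vector of the right width formats cleanly.
literal = to_pgvector_literal([0.0] * 768)
print(literal[:12])
```

Passing a vector of the wrong width fails fast, which mirrors the dimension check the vector(768) column itself enforces on insert.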
Populate the new column
If you already have embeddings in CSV format, follow the steps in Store vector
embeddings to store your embeddings.
Optionally, if you have text stored in the complaints column, use the
embedding() function to populate the new column with embeddings. In this
example setup, AlloyDB generates the embeddings using the textembedding-gecko
model, version 003.
  UPDATE items SET complaint_embedding = embedding('textembedding-gecko@003', complaints);
This example implicitly casts the real[] return value of embedding() into a
vector value, in order to store the value into the vector column created
earlier.
Create an index
To improve performance, add an index to items that uses scalar-quantization
techniques.
  CREATE INDEX complaint_embed_idx ON items
    USING ivf (complaint_embedding vector_l2_ops)
    WITH (lists = 20, quantizer = 'SQ8');
For more information on creating approximate nearest-neighbor indexes, see Indexing.
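The inverted-file (IVF) idea behind this index can be sketched in a few lines of Python: vectors are grouped into lists buckets around centroids, and a query searches only the bucket whose centroid is nearest. The centroids below are chosen naively (the first vectors seen) purely for illustration; a real IVF index trains them with clustering and, with quantizer = 'SQ8', also compresses the stored values.

```python
import math

def l2(a, b):
    """Euclidean (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_ivf(vectors, lists=2):
    # Naive centroid choice: the first `lists` vectors. A real index clusters.
    centroids = vectors[:lists]
    buckets = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(lists), key=lambda c: l2(v, centroids[c]))
        buckets[nearest].append(i)  # store row indices per bucket
    return centroids, buckets

def ivf_search(query, vectors, centroids, buckets):
    # Probe only the single nearest bucket: fast, but approximate.
    best = min(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    return min(buckets[best], key=lambda i: l2(query, vectors[i]))

vectors = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.1], [0.9, 1.1]]
centroids, buckets = build_ivf(vectors)
print(ivf_search([0.08, 0.12], vectors, centroids, buckets))  # → 2
```

Because only one bucket is scanned, the search is approximate: a true nearest neighbor sitting in a different bucket would be missed, which is the usual recall-for-speed trade-off of IVF indexes.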
Run LLM-powered queries with provided text
You can now make semantic nearest-neighbor queries on the items table. The
following query uses the <-> operator provided by pgvector to sort the
table's rows by semantic proximity to the text It was the wrong color and
return the ten nearest complaints. The query displays the id and name values
of each returned row.
  SELECT id, name FROM items
    ORDER BY complaint_embedding::vector
      <-> embedding('textembedding-gecko@003', 'It was the wrong color')
    LIMIT 10;
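What this ORDER BY ... <-> ... LIMIT 10 query computes can be sketched in plain Python: rank rows by L2 distance between each stored embedding and the query embedding, then keep the ten nearest. The two-dimensional row data below is invented for illustration.

```python
import math

def l2(a, b):
    """Euclidean (L2) distance, matching pgvector's <-> operator."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Invented rows: each carries an id, a name, and a stored embedding.
rows = [{"id": i, "name": f"item-{i}", "embedding": [i / 100.0, 1.0 - i / 100.0]}
        for i in range(25)]
query_embedding = [0.0, 1.0]  # stand-in for embedding('...', 'It was the wrong color')

# ORDER BY distance ascending, LIMIT 10.
nearest = sorted(rows, key=lambda r: l2(r["embedding"], query_embedding))[:10]
print([r["id"] for r in nearest])  # → ids 0 through 9, the ten closest rows
```

The database does the same ranking, but the ivf index lets it avoid comparing the query against every row.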
 
 
What's next
- For another example workflow involving AlloyDB and pgvector, see Building AI-powered apps on Google Cloud databases using pgvector, LLMs and LangChain .

