Before you begin
Make sure that you have registered your model endpoint with Model endpoint management. For more information, see Register and call remote AI models in AlloyDB Omni.
Invoke predictions for generic models
Use the google_ml.predict_row() SQL function to call a registered generic model endpoint and invoke predictions.
SELECT
  google_ml.predict_row(
    model_id => 'MODEL_ID',
    request_body => 'REQUEST_BODY');
 
 
Replace the following:
-  MODEL_ID: the model ID you defined when registering the model endpoint.
-  REQUEST_BODY: the parameters to the prediction function, in JSON format.
Examples
This section includes some examples for invoking predictions using registered model endpoints.
To generate predictions for a registered gemini-1.5-pro:streamGenerateContent model endpoint, run the following statement:
   
SELECT
  json_array_elements(
    google_ml.predict_row(
      model_id => 'gemini-1.5-pro:streamGenerateContent',
      request_body => '{ "contents": [ { "role": "user", "parts": [ { "text": "For TPCH database schema as mentioned here https://www.tpc.org/TPC_Documents_Current_Versions/pdf/TPC-H_v3.0.1.pdf , generate a SQL query to find all supplier names which are located in the India nation." } ] } ] }'))
  -> 'candidates' -> 0 -> 'content' -> 'parts' -> 0 -> 'text';
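The statement above returns one row per streamed chunk, and each value is still JSON, so the text keeps its surrounding quotes. As a rough sketch of how the chunks might be joined into a single string, the following query runs the same path over a hand-written sample array. The sample payload is hypothetical: its shape is only inferred from the path used above, and a real response may carry additional fields.

```sql
-- Hypothetical sample response: an array of chunks shaped like the path
-- 'candidates' -> 0 -> 'content' -> 'parts' -> 0 -> 'text' used above.
SELECT string_agg(
         -- ->> returns the final value as text rather than as quoted JSON.
         chunk -> 'candidates' -> 0 -> 'content' -> 'parts' -> 0 ->> 'text',
         '' ORDER BY n              -- keep the chunks in arrival order
       ) AS full_text
FROM json_array_elements(
       '[
          {"candidates": [{"content": {"parts": [{"text": "SELECT s_name"}]}}]},
          {"candidates": [{"content": {"parts": [{"text": " FROM supplier"}]}}]}
        ]'::json
     ) WITH ORDINALITY AS t(chunk, n);
```

This sample uses only standard PostgreSQL JSON functions, so it can be tried without a registered endpoint. To apply it to a live model, the JSON literal would be replaced by the google_ml.predict_row() call from the statement above.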
 
 
To generate predictions for a registered facebook/bart-large-mnli model endpoint on Hugging Face, run the following statement:
   
SELECT
  google_ml.predict_row(
    model_id => 'facebook/bart-large-mnli',
    request_body => '{
      "inputs": "Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!",
      "parameters": {"candidate_labels": ["refund", "legal", "faq"]}
    }');
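When the input text comes from a table rather than a literal, the request body can be assembled per row with PostgreSQL's JSON builder functions, which also handle quoting and escaping. The following is a minimal sketch, not a tested recipe: the support_tickets table and its id and body columns are hypothetical, and it assumes that the request_body parameter accepts the json value that json_build_object() returns.

```sql
-- Hypothetical table: support_tickets(id, body).
-- Builds the same request shape as the statement above, once per row.
SELECT
  t.id,
  google_ml.predict_row(
    model_id => 'facebook/bart-large-mnli',
    request_body => json_build_object(
      'inputs', t.body,
      'parameters', json_build_object(
        'candidate_labels', json_build_array('refund', 'legal', 'faq')))
  ) AS prediction
FROM support_tickets AS t
LIMIT 10;  -- each row triggers one remote call, so start with a small batch
```

Building the body with json_build_object() avoids hand-escaping quotes or apostrophes that appear in the ticket text.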
 
 
To generate predictions for a registered Anthropic claude-3-opus-20240229 model endpoint, run the following statement:
   
SELECT
  google_ml.predict_row(
    'anthropic-opus',
    '{
      "model": "claude-3-opus-20240229",
      "max_tokens": 1024,
      "messages": [
        {"role": "user", "content": "Hello, world"}
      ]
    }');
 
 

