Access Elasticsearch data from AlloyDB for PostgreSQL

Access and search data stored in Elasticsearch by creating a foreign data wrapper (FDW) and foreign table in AlloyDB for PostgreSQL.

Limitations

Before connecting AlloyDB to Elasticsearch, acknowledge the following limitations:

  • Elasticsearch integration is only available on PostgreSQL major version 17 and up.

  • AlloyDB reads, but doesn't write to, Elasticsearch data.

  • You are responsible for synchronizing data between AlloyDB and Elasticsearch.

  • Specialized Elasticsearch types, such as geo_point are not supported. For the full list of supported data types, see Supported data types .

Before you begin

Before you begin, ensure that you've completed the following:

Store Elasticsearch API key in Secret Manager

AlloyDB stores and reads your Elasticsearch API key from Secret Manager. For more information on how to use Secret Manager, see Create and access a secret using Secret Manager .

Ensure that you give your AlloyDB service account permission to read the secret. For more information, see Create and access a secret using Secret Manager .

Enable and configure external_search_fdw extension

To start your integration with Elasticsearch, complete the following instructions to enable and configure the external_search_fdw AlloyDB extension:

  1. Enable the external_search_fdw extension.

      CREATE 
      
     EXTENSION 
      
     external_search_fdw 
     ; 
     
    
  2. Configure access to your Elasticsearch cluster through a foreign data server.

      CREATE 
      
     SERVER 
      
      ELASTICSEARCH_SERVER_NAME 
     
     FOREIGN 
      
     DATA 
      
     WRAPPER 
      
     external_search_fdw 
     OPTIONS 
      
     ( 
     server 
      
     ' ELASTICSEARCH_SERVER_HOST_PORT 
    ' 
     , 
      
     search_provider 
      
     'elastic' 
     , 
      
     auth_mode 
      
     'secret_manager' 
     , 
      
     auth_method 
      
     ' AUTH_METHOD 
    ' 
     , 
      
     secret_path 
      
     ' SECRET_PATH 
    ' 
     , 
      
     max_deadline_ms 
      
     ' MAX_DEADLINE 
    ' 
     , 
      
     pagination_num_results 
      
     ' PAGINATION_NUM_RESULTS 
    ' 
     , 
      
     pagination_context_timeout_ms 
      
     ' PAGINATION_CONTEXT_TIMEOUT 
    ' 
     ); 
     
    

    Replace the following variables:

    • ELASTICSEARCH_SERVER_NAME : name for your foreign data server. For example, my-elasticsearch-server .

    • ELASTICSEARCH_SERVER_HOST_PORT : Public-facing URL for your Elasticsearch cluster. For example, https://node1.elastic.test.com:9200 .

    • AUTH_METHOD : type of authentication to use. You can choose between the following options:

    • SECRET_PATH : Secret Manager path to your Elasticsearch authentication credentials. For example, projects/123456789012/secrets/apikey/versions/1 . 123456789012 represents your Google Cloud project ID.

    • (Optional) MAX_DEADLINE : maximum amount of time, in milliseconds, that AlloyDB waits for a response from Elasticsearch. You should set this value based on the locations of your AlloyDB and Elasticsearch instances. The default value is 10000 .

    • (Optional) PAGINATION_NUM_RESULTS : max number of results fetched per batch from Elasticsearch. If more results are requested, AlloyDB retrieves the results in multiple batches of this size. The default value is 32 .

    • (Optional) PAGINATION_CONTEXT_TIMEOUT : amount of time, in milliseconds, that Elasticsearch keeps the pagination request context active. The default value is 30000 .

  3. Define the PostgreSQL user mapping for the Elasticsearch server. Note that PostgreSQL FDWs require this user mapping to function. AlloyDB authenticates using the REST authorization header.

      CREATE 
      
     USER 
      
     MAPPING 
      
     FOR 
      
     CURRENT_USER 
      
     SERVER 
      
      ELASTICSEARCH_SERVER_NAME 
     
     ; 
     
    
  4. Configure the schema for your Elasticsearch data through a foreign data table.

      CREATE 
      
     FOREIGN 
      
     TABLE 
      
      ELASTICSEARCH_FD_TABLE 
     
     ( 
      
     metadata 
      
     external_search_fdw_schema 
     . 
     OpaqueMetadata 
     , 
      
      ELASTICSEARCH_FIELDS 
     
     ) 
      
     SERVER 
      
      ELASTICSEARCH_SERVER_NAME 
     
      
     OPTIONS 
     ( 
     remote_table_name 
      
     ' ELASTICSEARCH_INDEX_NAME 
    ' 
     ); 
     
    

    Replace the following new variables:

    • ELASTICSEARCH_FD_TABLE : name of the foreign data table that represents your Elasticsearch table. For example, my-fd-elasticsearch-table .

    • ELASTICSEARCH_FIELDS : comma-separated list of Elasticsearch field schema definitions in the following format: elasticsearch_field_name PG_DATA_TYPE . For example, elasticsearch_boolean_field_name BOOLEAN, elasticsearch_double_field_name DOUBLE PRECISION . These fields must match the field names in Elasticsearch unless the remote_field_name option is appended. For example, elasticsearch_foo OPTIONS (remote_field_name 'elasticsearch_FOO') .

      For the list of Elasticsearch data types that can be defined for AlloyDB, see Supported data types .

    • ELASTICSEARCH_INDEX_NAME : name of your Elasticsearch index. For example, my-elasticsearch-index .

Supported data types

AlloyDB supports the following Elasticsearch data types:

Data type(s) PostgreSQL type
alias PostgreSQL type for the field that alias is referencing
binary bytea
boolean BOOLEAN

byte ,

short

SMALLINT
date TIMESTAMPTZ

double ,

scaled_float

DOUBLE PRECISION

float ,

half_float

REAL
integer INTEGER
long BIGINT

object ,

flattened

jsonb

text ,

annotated_text ,

keyword ,

constant_keyword ,

wildcard

TEXT
unsigned_long NUMERIC

Query your Elasticsearch data

AlloyDB takes SQL queries and converts them to Elasticsearch REST API queries. During this conversion, AlloyDB attempts to push down as much query logic as possible without changing the query's identity, including the SQL query's LIMIT . However, there are cases where you might specify not to push down certain Elasticsearch fields or where query logic cannot be pushed down. For example, LIKE and other text-matching operators cannot be pushed down. For more examples of what can and can't be pushed down, see Pushdown examples .

In scenarios where the LIMIT is set higher than pagination_num_results or where LIMIT is not specified or cannot be pushed down, AlloyDB uses the Scroll API , which can be resource-intensive.

Because the Scroll API can be resource-intensive, we recommend examining your queries using EXPLAIN VERBOSE to see which APIs are used. Limiting the use of the Scroll API and using LIMIT improves performance.

To query your Elasticsearch data, you have the following options:

  • Standard SQL queries
  • Query DSL
  • Hybrid searches

Standard SQL queries

Standard SQL queries can be written using Elasticsearch's Lucene syntax.

To perform a standard SQL query, see the following example query:

  SELECT 
  
 id 
 , 
  
 body 
 FROM 
  
  ELASTICSEARCH_FD_TABLE 
 
 WHERE 
  
  FILTER 
 
 ORDER 
  
 BY 
  
 metadata 
  
< @ 
>  
 ' QUERY 
' 
 ; 
 

Replace the following variables:

  • ELASTICSEARCH_FD_TABLE : name of the foreign data table that represents your Elasticsearch table. For example, my-fd-elasticsearch-table .

  • (Optional) FILTER : filter to apply to your Elasticsearch query. For example, AND qubits < 105 .

  • QUERY : query to send to Elasticsearch. For a few example queries, see the following list:

    • body:quantum body:computing
    • body:(quantum computing)
    • body:(quantum AND computing)
    • body:"quantum computing"
    • body:"quantum computing" AND qubits:[* TO 105}

Query DSL

Query DSL is Elasticsearch's full-featured, JSON-style query language recommended for advanced use cases. Query DSL lets you perform complex searches, filtering, and aggregations that can't be expressed in SQL query syntax.

To perform queries using Query DSL , see the following example query:

  SELECT 
  
 id 
 , 
  
 body 
 FROM 
  
  ELASTICSEARCH_FD_TABLE 
 
 ORDER 
  
 BY 
  
 metadata 
  
< @ 
>  
 $${ 
  
 "query" 
 : 
  
 { 
  
 "bool" 
 : 
  
 { 
  
 "must" 
 : 
  
 [ 
  
 { 
  
 "query_string" 
 : 
  
 { 
  
 "query" 
  
 : 
  
 " QUERY 
" 
  
 } 
  
 } 
  
 ], 
  
 "filter" 
 : 
  
 [ 
  
 { 
  
 "range" 
 : 
  
 { 
  
  
 "id" 
 : 
  
 { 
  
  
 "lt" 
 : 
  
 "10" 
  
 } 
  
 } 
  
 } 
  
 ] 
  
 } 
  
 } 
 , 
  
 "sort" 
 : 
  
 [ 
  
 { 
  
 "id" 
 : 
  
 { 
  
 "order" 
 : 
  
 "desc" 
  
 } 
  
 } 
  
 ] 
  
 }$$ 
 LIMIT 
  
 1 
 ; 
 

Replace the following variables:

  • ELASTICSEARCH_FD_TABLE : name of the foreign data table that represents your Elasticsearch table. For example, my-fd-elasticsearch-table .

  • QUERY : query to send to Elasticsearch. For example, "elasticsearch_field_name:\"quantum computing\" OR int_field:[* TO 3]" .

Note that for Query DSL, you're only expected to propagate the query , filter , and sort expressions.

To perform a hybrid search on your Elasticsearch data, see the following example search:

  SELECT 
  
 * 
 FROM 
  
 ai 
 . 
 hybrid_search 
 ( 
  
 ARRAY 
 [ 
  
 '{"limit": LIMIT 
, 
 "data_type": "external_search_fdw", 
 "weight": WEIGHT 
, 
 "table_name": " ELASTICSEARCH_FD_TABLE 
", 
 "key_column": " DOCUMENT_ID_COLUMN_NAME 
", 
 "query_text_input": QUERY 
}' 
 :: 
 jsonb 
 ], 
  
 NULL 
 :: 
 TEXT 
 , 
  
 'RRF' 
 , 
  
 FALSE 
 ) 
 ORDER 
  
 BY 
  
 score 
  
 DESC 
 ; 
 

Replace the following variables:

  • LIMIT : number of results to return. For example, 3 .

  • WEIGHT : contribution of this search entry to the overall Reciprocal Rank Fusion (RRF).

  • ELASTICSEARCH_FD_TABLE : name of the foreign data table that represents your Elasticsearch table. For example, my-fd-elasticsearch-table .

  • DOCUMENT_ID_COLUMN_NAME : name of the document ID column.

  • QUERY : query to send to Elasticsearch. For example, "elasticsearch_field_name:\"quantum computing\"" searches for the phrase "quantum computing" in the elasticsearch_field_name field. All query types mentioned in Supported data types can be used in your query.

For more information on the parameters available for hybrid searches, see Hybrid search function parameters .

Pushdown examples

To make queries more efficient, AlloyDB attempts to push down the following aspects of the query directly into the API call made to Elasticsearch:

  • SELECT fields
  • WHERE filters
  • ORDER BY sorts
  • LIMIT

For example queries that illustrate which aspects AlloyDB is and isn't able to push down, see the following table.

Query type
Query example
Query elements pushed down
Unfiltered queries
 SELECT 
  
 id 
 , 
  
 body 
 FROM 
  
 elasticsearch_table 
 ORDER 
  
 BY 
  
 metadata 
  
 <@> 
  
 'body:foo' 
  
 DESC 
 LIMIT 
  
 10 
 ; 
  • SELECT fields
  • ORDER BY ... DESC sort
  • LIMIT
Exact text match
 SELECT 
  
 id 
 , 
  
 body 
 FROM 
  
 elasticsearch_table 
 WHERE 
  
 body 
  
 = 
  
 'foo' 
 LIMIT 
  
 10 
 ; 
  • SELECT fields
  • WHERE filter
  • LIMIT
Single-field expressions
 SELECT 
  
 id 
 , 
  
 body 
 FROM 
  
 elasticsearch_table 
 WHERE 
  
 id 
  
 > 
  
 10 
 ORDER 
  
 BY 
  
 metadata 
  
 <@> 
  
 'body:foo' 
 LIMIT 
  
 10 
 ; 
  • SELECT fields
  • WHERE filter
Constant expressions
 SELECT 
  
 id 
 , 
  
 body 
 FROM 
  
 elasticsearch_table 
 WHERE 
  
 id 
  
 > 
  
 ( 
 1 
 + 
 1 
 ) 
 LIMIT 
  
 10 
 ; 
  • SELECT fields
  • WHERE filter
  • LIMIT
Expressions with functions
 SELECT 
  
 id 
 , 
  
 body 
 FROM 
  
 elasticsearch_table 
 WHERE 
  
 id 
  
 > 
  
 CEIL 
 ( 
 3 
 . 
 14 
 ) 
 LIMIT 
  
 10 
 ; 
  • SELECT fields
Multi-field expressions
 SELECT 
  
 id 
 , 
  
 body 
 FROM 
  
 elasticsearch_table 
 WHERE 
  
 dbl_field 
  
 < 
  
 flt_field 
 LIMIT 
  
 10 
 ; 
  • SELECT fields
Score filtering
 SELECT 
  
 id 
 , 
  
 body 
 , 
  
 ( 
 metadata 
  
 <@> 
  
 'body:bar' 
 ) 
  
 AS 
  
 score 
 FROM 
  
 elasticsearch_table 
 WHERE 
  
 score 
  
 > 
  
 0 
 . 
 5 
 ORDER 
  
 by 
  
 score 
  
 desc 
 LIMIT 
  
 10 
 ; 
  • SELECT fields
  • ORDER BY ... DESC sort
LIKE and similar operators
 SELECT 
  
 id 
 , 
  
 body 
 FROM 
  
 elasticsearch_table 
 WHERE 
  
 id 
  
 > 
  
 10 
  
 AND 
  
 body 
  
 LIKE 
  
 '%foo%' 
 LIMIT 
  
 10 
 ; 
  • SELECT fields
  • WHERE id > 10 filter
Raw queries
 SELECT 
  
 id 
 , 
  
 body 
 FROM 
  
 elasticsearch_table 
 WHERE 
  
 id 
  
 < 
  
 10 
 ORDER 
  
 BY 
  
 metadata 
  
 <@> 
  
 $${ 
 "query" 
 : 
  
 { 
  
 "match_all" 
 : 
  
 {}}}$$ 
  
 DESC 
 LIMIT 
  
 10 
 ; 
  • SELECT fields
  • ORDER BY ... DESC sort
Design a Mobile Site
View Site in Mobile | Classic
Share by: