Mix full-text and non-text queries

This page describes how to perform a search that mixes full-text and non-text data.

Perform a mixed full-text and non-text search

Search indexes support full-text search, exact-match search, numeric columns, and JSON/JSONB columns. You can combine text and non-text conditions in the WHERE clause, much as you do in multi-column search queries. The query optimizer tries to evaluate non-text predicates with the search index. If that's not possible, Spanner evaluates the condition for every row that matches the search index. Referenced columns that aren't stored in the search index are fetched from the base table.

Consider the following example:

GoogleSQL

    CREATE TABLE Albums (
      AlbumId STRING(MAX) NOT NULL,
      Title STRING(MAX),
      Rating FLOAT64,
      Genres ARRAY<STRING(MAX)>,
      Likes INT64,
      Cover BYTES(MAX),
      Title_Tokens TOKENLIST AS (TOKENIZE_FULLTEXT(Title)) HIDDEN,
      Rating_Tokens TOKENLIST AS (TOKENIZE_NUMBER(Rating)) HIDDEN,
      Genres_Tokens TOKENLIST AS (TOKEN(Genres)) HIDDEN
    ) PRIMARY KEY(AlbumId);

    CREATE SEARCH INDEX AlbumsIndex
    ON Albums(Title_Tokens, Rating_Tokens, Genres_Tokens)
    STORING (Likes);

PostgreSQL

Spanner PostgreSQL support has the following limitations:

The spanner.tokenize_number function supports only the bigint type.
The spanner.token function doesn't support tokenizing arrays.

    CREATE TABLE albums (
      albumid character varying NOT NULL,
      title character varying,
      rating bigint,
      genres character varying NOT NULL,
      likes bigint,
      cover bytea,
      title_tokens spanner.tokenlist AS (spanner.tokenize_fulltext(title)) VIRTUAL HIDDEN,
      rating_tokens spanner.tokenlist AS (spanner.tokenize_number(rating)) VIRTUAL HIDDEN,
      genres_tokens spanner.tokenlist AS (spanner.token(genres)) VIRTUAL HIDDEN,
      PRIMARY KEY(albumid));

    CREATE SEARCH INDEX albumsindex
    ON albums(title_tokens, rating_tokens, genres_tokens)
    INCLUDE (likes);

The behavior of queries on this table includes the following:

Rating and Genres are included in the search index, so Spanner accelerates conditions on them using search index posting lists. ARRAY_INCLUDES_ANY and ARRAY_INCLUDES_ALL are GoogleSQL functions and aren't supported in the PostgreSQL dialect.
```
SELECT AlbumId
FROM Albums
WHERE Rating > 4
  AND ARRAY_INCLUDES_ANY(Genres, ['jazz'])
```

The query can combine conjunctions, disjunctions, and negations in any way, including mixing full-text and non-text predicates. This query is fully accelerated by the search index.

    SELECT AlbumId
    FROM Albums
    WHERE (SEARCH(Title_Tokens, 'car') OR Rating > 4)
      AND NOT ARRAY_INCLUDES_ANY(Genres, ['jazz'])

Likes is stored in the index, but the schema doesn't request Spanner to build a token index for its possible values. Therefore, the full-text predicate on Title and the non-text predicate on Rating are accelerated, but the predicate on Likes isn't. Spanner fetches all documents that have the term "car" in the Title and a rating greater than 4, then filters out documents that don't have at least 1000 likes. This query uses a lot of resources if almost all albums have the term "car" in their title and almost all of them have a rating of 5, but few albums have 1000 likes. In such cases, indexing Likes similarly to Rating saves resources.
GoogleSQL
```
SELECT AlbumId
FROM Albums
WHERE SEARCH(Title_Tokens, 'car')
  AND Rating > 4
  AND Likes >= 1000
```
PostgreSQL
```
SELECT albumid
FROM albums
WHERE spanner.search(title_tokens, 'car')
  AND rating > 4
  AND likes >= 1000
```
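
As a rough illustration of indexing Likes similarly to Rating, the following GoogleSQL sketch adds a numeric token column for Likes and builds a search index that includes it. The names Likes_Tokens and AlbumsIndexWithLikes are hypothetical and don't appear in the schema above. The sketch also assumes that ALTER TABLE ... ADD COLUMN accepts the same generated TOKENLIST column definition that CREATE TABLE does.

```sql
-- Sketch only: Likes_Tokens and AlbumsIndexWithLikes are hypothetical names.
-- Tokenize Likes the same way the schema tokenizes Rating.
ALTER TABLE Albums
  ADD COLUMN Likes_Tokens TOKENLIST AS (TOKENIZE_NUMBER(Likes)) HIDDEN;

-- Build a search index that covers Likes_Tokens in addition to the
-- original token columns, so a predicate such as Likes >= 1000 can be
-- evaluated from the index rather than filtered row by row afterward.
CREATE SEARCH INDEX AlbumsIndexWithLikes
ON Albums(Title_Tokens, Rating_Tokens, Genres_Tokens, Likes_Tokens)
STORING (Likes);
```

With an index like this in place, the optimizer can consider accelerating all three predicates in the preceding query. Whether it does so in practice is best confirmed by inspecting the query plan.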

Cover isn't stored in the index. The following query does a back join between AlbumsIndex and Albums to fetch Cover for all matching albums.

GoogleSQL

    SELECT AlbumId, Cover
    FROM Albums
    WHERE SEARCH(Title_Tokens, 'car')
      AND Rating > 4

PostgreSQL

    SELECT albumid, cover
    FROM albums
    WHERE spanner.search(title_tokens, 'car')
      AND rating > 4

What's next

Learn about full-text search queries.
Learn how to rank search results.
Learn how to perform a substring search.
Learn how to paginate search results.
Learn how to search multiple columns.
