This page describes how to perform a search that mixes full-text and non-text data.
Perform a mixed full-text and non-text search
Search indexes
support
full-text, exact match, numeric columns, and JSON/JSONB columns. You can combine
text and non-text conditions in the WHERE
clause similarly to multi-column
search queries. The query optimizer tries to optimize non-text predicates with a
search index. If that's not possible, Spanner evaluates the
condition for every row that matches the search index. Referenced columns not
stored in the search index are fetched from the base table.
Consider the following example:
GoogleSQL
CREATE
TABLE
Albums
(
AlbumId
STRING
(
MAX
)
NOT
NULL
,
Title
STRING
(
MAX
),
Rating
FLOAT64
,
Genres
ARRAY<STRING
(
MAX
)>,
Likes
INT64
,
Cover
BYTES
(
MAX
),
Title_Tokens
TOKENLIST
AS
(
TOKENIZE_FULLTEXT
(
Title
))
HIDDEN
,
Rating_Tokens
TOKENLIST
AS
(
TOKENIZE_NUMBER
(
Rating
))
HIDDEN
,
Genres_Tokens
TOKENLIST
AS
(
TOKEN
(
Genres
))
HIDDEN
)
PRIMARY
KEY
(
AlbumId
);
CREATE
SEARCH
INDEX
AlbumsIndex
ON
Albums
(
Title_Tokens
,
Rating_Tokens
,
Genres_Tokens
)
STORING
(
Likes
);
PostgreSQL
Spanner PostgreSQL support has the following limitations:
-
spanner.tokenize_numberfunction only supports thebiginttype. -
spanner.tokendoesn't support tokenizing arrays.
CREATE
TABLE
albums
(
albumid
character
varying
NOT
NULL
,
title
character
varying
,
rating
bigint
,
genres
character
varying
NOT
NULL
,
likes
bigint
,
cover
bytea
,
title_tokens
spanner
.
tokenlist
AS
(
spanner
.
tokenize_fulltext
(
title
))
VIRTUAL
HIDDEN
,
rating_tokens
spanner
.
tokenlist
AS
(
spanner
.
tokenize_number
(
rating
))
VIRTUAL
HIDDEN
,
genres_tokens
spanner
.
tokenlist
AS
(
spanner
.
token
(
genres
))
VIRTUAL
HIDDEN
,
PRIMARY
KEY
(
albumid
));
CREATE
SEARCH
INDEX
albumsindex
ON
albums
(
title_tokens
,
rating_tokens
,
genres_tokens
)
INCLUDE
(
likes
);
The behavior of queries on this table include the following:
-
RatingandGenresare included in the search index. Spanner accelerates conditions using search index posting lists.ARRAY_INCLUDES_ANY,ARRAY_INCLUDES_ALLare GoogleSQL functions and are not supported for PostgreSQL dialect.SELECT Album FROM Albums WHERE Rating > 4 AND ARRAY_INCLUDES_ANY ( Genres , [ 'jazz' ]) -
The query can combine conjunctions, disjunctions, and negations in any way, including mixing full-text and non-text predicates. This query is fully accelerated by the search index.
SELECT Album FROM Albums WHERE ( SEARCH ( Title_Tokens , 'car' ) OR Rating > 4 ) AND NOT ARRAY_INCLUDES_ANY ( Genres , [ 'jazz' ]) -
Likesis stored in the index, but the schema doesn't request Spanner to build a token index for its possible values. Therefore, the full-text predicate onTitleand non-text predicate onRatingis accelerated, but the predicate onLikesisn't. In Spanner, the query fetches all documents with the term "car" in theTitleand a rating more than 4, then it filters documents that don't have at least 1000 likes. This query uses a lot of resources if almost all albums have the term "car" in their title and almost all of them have a rating of 5, but few albums have 1000 likes. In such cases, indexingLikessimilarly toRatingsaves resources.GoogleSQL
SELECT Album FROM Albums WHERE SEARCH ( Title_Tokens , 'car' ) AND Rating > 4 AND Likes > = 1000PostgreSQL
SELECT album FROM albums WHERE spanner . search ( title_tokens , 'car' ) AND rating > 4 AND likes > = 1000 -
Coverisn't stored in the index. The following query does a back join betweenAlbumsIndexandAlbumsto fetchCoverfor all matching albums.GoogleSQL
SELECT AlbumId , Cover FROM Albums WHERE SEARCH ( Title_Tokens , 'car' ) AND Rating > 4PostgreSQL
SELECT albumid , cover FROM albums WHERE spanner . search ( title_tokens , 'car' ) AND rating > 4
What's next
- Learn about full-text search queries .
- Learn how to rank search results .
- Learn how to perform a substring search .
- Learn how to paginate search results .
- Learn how to search multiple columns .

