This page describes Vertex AI Search's basic autocomplete feature. Autocomplete generates query suggestions based on the first few characters entered for the query.
The suggestions that autocomplete generates vary depending on the type of data that the search app uses:
- Structured and unstructured data. By default, autocomplete generates suggestions based on the content of documents in the data store. After document import, autocomplete doesn't start generating suggestions until there is sufficient quality data, which typically takes a couple of days. If you make autocomplete requests through the API, autocomplete can also generate suggestions that are based on the search history or user events.
- Website data. By default, autocomplete generates suggestions from the search history, so it requires real search traffic. After search traffic begins, autocomplete takes a day or two before generating suggestions. Suggestions can also be generated from web-crawled data from public sites with the experimental advanced document data model.
- Healthcare data. By default, a canonical medical data source is used to generate autocomplete suggestions for healthcare data stores.
The query suggestions model determines what type of data autocomplete uses to generate suggestions. There are four query suggestions models:
- Document. The document model generates suggestions from user-imported documents. This model isn't available for website data or healthcare data.
- Completable fields. The completable fields model suggests text taken directly from structured data fields. Only those fields that are annotated with `completable` are used for autocomplete suggestions. This model is only available for structured data.
- Search history. The search history model generates suggestions from the history of `SearchService.search` API calls. Don't use this model if there is no traffic available for the `servingConfigs.search` method. This model isn't available for healthcare data.
- User event. The user events model generates suggestions from user-imported events of type `search`. This model isn't available for healthcare data.

Autocomplete requests are sent using the `dataStores.completeQuery` method.

Alternatively, if you don't want to use a query suggestions model, you can use Imported suggestions that provide autocomplete suggestions based on an imported list of suggestions. For more information, see Use an imported list of autocomplete suggestions.
Model types available according to data type
The following table shows the query suggestions model types available for each data type.
Query suggestions model | Data source | Website data | Structured data | Unstructured data |
---|---|---|---|---|
Document | Imported | | ✔* (default) | ✔ (default) |
Completable fields | Imported | | ✔ | |
Search history | Automatically collected | ✔ (default) | ✔ | ✔ |
User events | Imported or automatically collected by widget | ✔ | ✔ | ✔ |
Web-crawled content | Crawled from content from public websites that you specify | ✔ † | | |
*: The document schema must contain `title` or `description` fields, or there must be fields that have been specified as `title` or `description` key properties. See Update a schema for structured data.

†: Web-crawled content can only be used as a data source if the experimental advanced document data model for autocomplete is enabled. See Advanced document data model.
If you don't want to use the default model for your data type, you can specify a different model when you send your autocomplete request. Autocomplete requests are sent using the `dataStores.completeQuery` method. For information, see API instructions: Send an autocomplete request to choose a different model.
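The availability rules in the table can be expressed as a small lookup, which may help when an application has to decide which model name to send. This is only a sketch: the dictionary layout and the `resolve_model` helper are hypothetical and not part of any Vertex AI Search client library. The model identifiers are the ones accepted by the `query_model` request parameter described later on this page, and healthcare data and web-crawled content are left out for brevity.

```python
# Which query suggestions models are available per data type, transcribed
# from the table above. Names of these structures are illustrative only.
MODEL_AVAILABILITY = {
    "document": {"structured", "unstructured"},
    "document-completable": {"structured"},
    "search-history": {"website", "structured", "unstructured"},
    "user-event": {"website", "structured", "unstructured"},
}

# Default model per data type, also taken from the table.
DEFAULT_MODEL = {
    "website": "search-history",
    "structured": "document",
    "unstructured": "document",
}


def resolve_model(data_type, requested=None):
    """Return the model to use, or raise if the combination is unavailable."""
    if data_type not in DEFAULT_MODEL:
        raise ValueError(f"unknown data type: {data_type!r}")
    if requested is None:
        return DEFAULT_MODEL[data_type]
    if data_type not in MODEL_AVAILABILITY.get(requested, set()):
        raise ValueError(f"{requested!r} is not available for {data_type} data")
    return requested
```

For example, `resolve_model("website")` falls back to the search history model, while `resolve_model("website", "document")` raises because the document model isn't available for website data.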
Autocomplete features
Vertex AI Search supports the following autocomplete features to show the most helpful predictions during search:
- Typo correction. For example, `Milc` → `Milk`.
- Unsafe term removal.
  - Powered by Google Safe Search.
  - Removes inappropriate queries.
  - Supported in English (`en`), French (`fr`), German (`de`), Italian (`it`), Polish (`pl`), Portuguese (`pt`), Russian (`ru`), Spanish (`es`), and Ukrainian (`uk`).
- PII removal. For example, if there is an email address `jeffersonloveshiking@gmail.com` in the data store, Vertex AI Search won't return the email address as an autocomplete suggestion if the user types `jef` in the search bar. To more thoroughly safeguard against PII leaks, Google recommends that you apply your own Data Loss Protection (DLP) solution in addition to the detectors that are provided by Vertex AI Search. For more information, see Protect against PII leaks.
- Denylist. Removes terms that are listed in the denylist.
- Deduplication.
  - Powered by AI-driven semantic understanding.
  - For near-identical terms, either term matches, but only the more popular one is suggested. For example, `Shoes for Women`, `Womens Shoes`, and `Womans Shoes` are deduplicated, and only the most popular one is suggested.
  - Not available in US and EU multi-regions.
- Tail match suggestions.
  - Optional setting.
  - If there are no autocomplete matches for the entire query, suggests matches for only the trailing word of the query.
  - Not available for healthcare search.
Tail match suggestions
Tail match suggestions are made using exact prefix matching against the last word in a query string.
For example, say the query "songs with he" is sent in an autocomplete request. When tail matching is enabled, autocomplete might find that the full prefix "songs with he" does not have any matches. However, the last word in the query, "he", has an exact prefix match with "hello world" and "hello kitty". In that case, the returned suggestions are "songs with hello world" and "songs with hello kitty" because there are no full match suggestions.
You can use this feature to reduce empty suggestion results and increase suggestion diversity, which is especially useful when data sources (user event count, search history, and document topic coverage) are limited. However, enabling tail match suggestions can reduce the overall quality of suggestions. Because tail match only matches the trailing word of the prefix, some returned suggestions might not make sense. For example, a query such as "songs with he" might get a tail match suggestion like "songs with helpers guides".
Tail match suggestions are only returned if:
- `include_tail_suggestions` is set to `true` in the `dataStores.completeQuery` request.
- There are no full prefix match suggestions for the query.
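The fallback behavior described in this section can be illustrated with a toy matcher. This is a simplified sketch of the idea only, not how the service is implemented: the `suggest` function and the in-memory candidate list are hypothetical, and ranking, popularity, and filtering are ignored.

```python
def suggest(query, candidates, include_tail_suggestions=False):
    """Return full prefix matches, falling back to tail matches if allowed."""
    q = query.lower()
    # Full prefix match: the whole query must prefix the candidate.
    full = [c for c in candidates if c.lower().startswith(q)]
    if full or not include_tail_suggestions:
        return full

    # Tail match: keep everything before the last word and complete only
    # the last word with an exact prefix match.
    head, _, tail = q.rpartition(" ")
    if not tail:
        return []
    prefix = head + " " if head else ""
    return [prefix + c for c in candidates if c.lower().startswith(tail)]


candidates = ["hello world", "hello kitty", "songs about rain"]
tail_results = suggest("songs with he", candidates, include_tail_suggestions=True)
```

With tail matching enabled, `tail_results` contains "songs with hello world" and "songs with hello kitty". With the flag left at its default, the same call returns an empty list because there is no full prefix match.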
Protect against PII leaks
The definition of PII is broad, and PII can be difficult to detect. As a result, Vertex AI Search can't guarantee that PII won't be returned in autocomplete suggestions.
Vertex AI Search applies the Sensitive Data Protection inspection service to look for and block common types of PII from appearing as suggestions. However, if your data stores contain PII or if you use the search history or user events query suggestions models, review the following and take appropriate action:
- If the types of PII that you want to protect are fairly standard, such as phone numbers and email addresses, begin by extensively testing autocomplete suggestions for your app. Vertex AI Search can't guarantee that PII won't be returned in autocomplete suggestions.
- If PII leaks are discovered during autocomplete testing or if you already know that you have non-standard PII to protect (for example, proprietary user IDs), then try adjusting the autocomplete threshold and content serving parameters. For more information, see Reduce the risk of returning suggestions that contain PII.
- If adjusting the parameters isn't sufficient to prevent PII leaks, then implement your own DLP solution. Customize the DLP solution for the types of PII most likely to be found in your data stores, user events, or users' search queries. You can use Sensitive Data Protection or a third-party DLP service. Take one of the following approaches:
  - Filter out PII before you import the documents and user events into your data stores.
  - Review autocomplete suggestions at serving time, before presenting them to the user, and block the suggestions that contain any PII.
- If you use the search history or user events model, add some informational text on the search bar, telling your users not to put PII in their search queries.
- If you have questions or encounter particular challenges with blocking PII, contact your customer engineer (CE) or Google account team.
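The serving-time approach in the list above can be sketched as a filter that runs over suggestions before they are shown. The example below is a minimal illustration that uses two simple regular expressions (an email pattern and a US-style phone pattern); it is an assumption-laden stand-in for a real DLP service, and the `filter_suggestions` helper and both patterns are hypothetical.

```python
import re

# Deliberately simple patterns for illustration. A production setup would
# rely on a DLP service rather than hand-written regular expressions.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_PHONE_RE = re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b")


def filter_suggestions(suggestions, patterns=(EMAIL_RE, US_PHONE_RE)):
    """Drop every suggestion that matches at least one PII pattern."""
    return [
        s for s in suggestions
        if not any(p.search(s) for p in patterns)
    ]


raw = ["hiking boots", "jeffersonloveshiking@gmail.com", "call 555-123-4567"]
safe = filter_suggestions(raw)
```

Here `safe` keeps only "hiking boots". Additional compiled patterns, for example one for a proprietary ID format, can be passed through the `patterns` argument.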
Turn autocomplete on or off for a widget
To turn autocomplete on or off for a widget, follow these steps:
Console
- In the Google Cloud console, go to the AI Applications page.
- Click the name of the app that you want to edit.
- Click Configurations.
- Click the UI tab.
- Toggle the Show autocomplete suggestions option to turn autocomplete suggestions for the widget on or off. When you enable autocomplete, expect to wait a day or two before suggestions start.
Update autocomplete settings
To configure the autocomplete settings in the UI, follow these steps:
Console
- In the Google Cloud console, go to the AI Applications page.
- Click the name of the app that you want to edit.
- Click Configurations.
- Click the Autocomplete tab.
- Enter or select new values for the autocomplete settings you want to update:
  - Maximum number of suggestions: The maximum number of autocomplete suggestions that can be offered for a query.
  - Minimum length to trigger: The minimum number of characters that must be typed before autocomplete suggestions are offered.
  - Matching order: The location in a query string that autocomplete can start matching its suggestions from.
  - Query suggestions model: The query suggestions model used to generate the retrieved suggestions. This can be overridden in the `dataStores.completeQuery` request using the `queryModel` parameter.
  - Enable autocomplete: By default, autocomplete doesn't start making suggestions until it has sufficient quality data, typically a couple of days. If you want to override this default and start getting some autocomplete suggestions sooner, select Now. Even when you select Now, it can take a day for suggestions to be generated, and some suggestions will still be missing or of poor quality until there is sufficient good data.
  - Deny list: Import a denylist as a JSON file in a Cloud Storage bucket. For more information about the denylist constraints and specifications, see Use an autocomplete denylist.
- Click Save and publish. Changes take effect within a few minutes for engines where autocomplete has already been turned on.
Reduce the risk of returning suggestions that contain PII
End users have many kinds of PII, such as driver's license numbers and telephone numbers, that they are supposed to keep private. However, users who are looking for results that are specific to themselves might type any of this PII into the search bar.
If you use the search history or user events model and there is a likelihood of your users typing PII into the search bar, then you can reduce PII leaks by adjusting the following parameters:
- `queryFrequencyThreshold`: Before a query can be returned as an autocomplete suggestion, it must have been entered this many times.
- `numUniqueUsersThreshold`: Before a query can be returned as an autocomplete suggestion, it must have been entered by this many unique users. The value of the `userPseudoId` field in the search user event determines if the user is unique.
Use case example
For example, take a case where users have account numbers that should be kept private.
If the search history or user events suggestion model is in use, then these account numbers, along with all the other terms that end users search for, are used to generate suggestions. Thus, if user-A's account number `YZ-46789A` has repeatedly been entered into the search bar and user-B has an account number of `YZ-42345B`, when user-B types `YZ-4` into the search bar, the autocomplete suggestion returned might be user-A's account number.
To reduce the likelihood of this kind of leak happening, the AI Applications administrator decides to:
- Increase the value of the `queryFrequencyThreshold` parameter to `30`. In this case, it is very unlikely for one account number to be entered so often. However, popular search queries will be entered at least that often.
- Increase the value of the `numUniqueUsersThreshold` parameter to `6`. The administrator thinks it unlikely for the same account number to be entered in the search bar in six search events each associated with a different `userPseudoId`.
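The effect of the two thresholds in this use case can be simulated locally. The sketch below assumes that a query becomes eligible once both counts reach their thresholds and ignores the rolling time window; the `eligible_queries` function and the sample events are hypothetical and only illustrate the reasoning, not the service's internal logic.

```python
from collections import defaultdict


def eligible_queries(search_events, query_frequency_threshold,
                     num_unique_users_threshold):
    """Return queries that meet both the frequency and unique-user thresholds.

    search_events is an iterable of (user_pseudo_id, query) pairs.
    """
    counts = defaultdict(int)
    users = defaultdict(set)
    for user_pseudo_id, query in search_events:
        counts[query] += 1
        users[query].add(user_pseudo_id)
    return {
        q for q in counts
        if counts[q] >= query_frequency_threshold
        and len(users[q]) >= num_unique_users_threshold
    }


# user-A enters their own account number 40 times; 10 different users
# each search for "running shoes" 4 times.
events = [("user-A", "YZ-46789A")] * 40
events += [(f"user-{i}", "running shoes") for i in range(10) for _ in range(4)]

allowed = eligible_queries(events, query_frequency_threshold=30,
                           num_unique_users_threshold=6)
```

In this simulation the account number passes the frequency threshold but comes from a single `userPseudoId`, so only "running shoes" remains eligible.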
Procedure
There are two threshold parameters for autocomplete. These parameters aren't available in the Google Cloud console but can be set with a REST API call to the `updateCompletionConfig` method.
To configure the autocomplete threshold settings, follow these steps. Each step is optional, depending on the parameter you want to change.
REST
- Update the `CompletionConfig.queryFrequencyThreshold` field:

      curl -X PATCH \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        -H "Content-Type: application/json" \
        -H "X-Goog-User-Project: PROJECT_ID" \
        "https://discoveryengine.googleapis.com/v1alpha/projects/PROJECT_ID/locations/global/collections/default_collection/dataStores/DATA_STORE_ID/completionConfig?updateMask=queryFrequencyThreshold" \
        -d '{
          "name": "projects/PROJECT_ID/locations/global/collections/default_collection/dataStores/DATA_STORE_ID/completionConfig",
          "queryFrequencyThreshold": QUERY_FREQUENCY_THRESHOLD
        }'

  Replace the following:
  - `PROJECT_ID`: the number or ID of your Google Cloud project.
  - `DATA_STORE_ID`: the ID of the data store that is associated with your app.
  - `QUERY_FREQUENCY_THRESHOLD`: an integer value that indicates the minimum number of times that a search query must be entered before it can be returned as an autocomplete suggestion. The count is summed over a months-long, rolling time window. The default is `8`.
- Update the `CompletionConfig.numUniqueUsersThreshold` field:

      curl -X PATCH \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        -H "Content-Type: application/json" \
        -H "X-Goog-User-Project: PROJECT_ID" \
        "https://discoveryengine.googleapis.com/v1alpha/projects/PROJECT_ID/locations/global/collections/default_collection/dataStores/DATA_STORE_ID/completionConfig?updateMask=numUniqueUsersThreshold" \
        -d '{
          "name": "projects/PROJECT_ID/locations/global/collections/default_collection/dataStores/DATA_STORE_ID/completionConfig",
          "numUniqueUsersThreshold": UNIQUE_USERS
        }'

  Replace `UNIQUE_USERS` with an integer value that represents the minimum number of unique users who must enter a given search query before it can be returned as an autocomplete suggestion. The count is summed over a months-long, rolling time window. The default is `3`.
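The same PATCH calls can be prepared from Python with the standard library. The following sketch only builds the request object and mirrors the URL, update mask, and body of the curl commands above; the `build_threshold_update` helper is hypothetical, and the access token is assumed to come from `gcloud auth print-access-token` or an equivalent credential source.

```python
import json
import urllib.request

API_ROOT = "https://discoveryengine.googleapis.com/v1alpha"
THRESHOLD_FIELDS = {"queryFrequencyThreshold", "numUniqueUsersThreshold"}


def build_threshold_update(project_id, data_store_id, access_token, field, value):
    """Build (but do not send) a PATCH request that updates one threshold."""
    if field not in THRESHOLD_FIELDS:
        raise ValueError(f"unsupported field: {field!r}")
    name = (
        f"projects/{project_id}/locations/global/collections/"
        f"default_collection/dataStores/{data_store_id}/completionConfig"
    )
    body = {"name": name, field: int(value)}
    return urllib.request.Request(
        url=f"{API_ROOT}/{name}?updateMask={field}",
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
            "X-Goog-User-Project": project_id,
        },
    )


# Placeholder values; replace them before sending the request.
request = build_threshold_update(
    "PROJECT_ID", "DATA_STORE_ID", "ACCESS_TOKEN", "queryFrequencyThreshold", 30
)
```

To send the request, pass it to `urllib.request.urlopen(request)` once real project, data store, and token values are in place.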
Update completable field annotations in schema
To turn on autocomplete for fields in structured data schema, follow these steps:
Console
- In the Google Cloud console, go to the AI Applications page.
- Click the name of the app that you want to edit. It must use structured data.
- Click Data.
- Click the Schema tab.
- Click Edit to select the schema fields to mark as `completable`.
- Click Save to save the updated field configurations. These suggestions take around a day to be generated and returned.
Send autocomplete requests
The following samples show how to send autocomplete requests.
REST
To send an autocomplete request using the API, follow these steps:
- Find your data store ID. If you already have your data store ID, skip to the next step.
  - In the Google Cloud console, go to the AI Applications page and in the navigation menu, click Data Stores.
  - Click the name of your data store.
  - On the Data page for your data store, get the data store ID.
- Call the `dataStores.completeQuery` method.

      curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        "https://discoveryengine.googleapis.com/v1/projects/PROJECT_ID/locations/global/collections/default_collection/dataStores/DATA_STORE_ID:completeQuery?query=QUERY_STRING"

  Replace the following:
  - `PROJECT_ID`: the number or ID of your Google Cloud project.
  - `DATA_STORE_ID`: the ID of the data store that is associated with your app.
  - `QUERY_STRING`: the typeahead input used to fetch suggestions.
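When the request URL is assembled in code rather than typed by hand, the typeahead input needs to be percent-encoded because it usually contains spaces. The sketch below builds the same URL as the curl sample above using only the Python standard library; the `complete_query_url` helper is a hypothetical name introduced for illustration.

```python
from urllib.parse import quote, urlencode

API_ROOT = "https://discoveryengine.googleapis.com/v1"


def complete_query_url(project_id, data_store_id, query):
    """Return the dataStores.completeQuery URL for a typeahead input."""
    resource = (
        f"projects/{quote(project_id, safe='')}/locations/global/collections/"
        f"default_collection/dataStores/{quote(data_store_id, safe='')}"
    )
    # quote_via=quote encodes spaces as %20 instead of "+".
    query_string = urlencode({"query": query}, quote_via=quote)
    return f"{API_ROOT}/{resource}:completeQuery?{query_string}"


url = complete_query_url("my-project", "my-data-store", "songs with he")
```

The resulting URL can be passed to any HTTP client together with the same `Authorization: Bearer` header that the curl sample uses. The project and data store IDs shown here are placeholders.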
Send an autocomplete request to a different model
To send an autocomplete request with a different query suggestions model, follow these steps:
- Find your data store ID. If you already have your data store ID, skip to the next step.
  - In the Google Cloud console, go to the AI Applications page and in the navigation menu, click Data Stores.
  - Click the name of your data store.
  - On the Data page for your data store, get the data store ID.
- Call the `dataStores.completeQuery` method.

      curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        "https://discoveryengine.googleapis.com/v1/projects/PROJECT_ID/locations/global/collections/default_collection/dataStores/DATA_STORE_ID:completeQuery?query=QUERY_STRING&query_model=QUERY_SUGGESTIONS_MODEL"

  Replace the following:
  - `PROJECT_ID`: the number or ID of your Google Cloud project.
  - `DATA_STORE_ID`: the unique ID of the data store that is associated with your app.
  - `QUERY_STRING`: the typeahead input used to fetch suggestions.
  - `QUERY_SUGGESTIONS_MODEL`: the query suggestions model to use for the request: `document`, `document-completable`, `search-history`, or `user-event`. For healthcare data, use `healthcare-default`.
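Because the model is passed as a plain string, a typo in its name is easy to miss. One possible safeguard, sketched below, is to check the value against the names listed above before building the query string. The `autocomplete_params` helper is hypothetical, and the parameter name `query_model` is copied from the curl sample.

```python
from urllib.parse import quote, urlencode

# Model names as listed in the replacement table above.
QUERY_MODELS = (
    "document",
    "document-completable",
    "search-history",
    "user-event",
    "healthcare-default",
)


def autocomplete_params(query, query_model=None):
    """Return the encoded query string for a completeQuery request."""
    params = {"query": query}
    if query_model is not None:
        if query_model not in QUERY_MODELS:
            raise ValueError(f"unknown query suggestions model: {query_model!r}")
        params["query_model"] = query_model
    return urlencode(params, quote_via=quote)


params = autocomplete_params("run", "search-history")
```

Omitting `query_model` leaves the choice to the default model for the data store's data type.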