This page shows you how to configure the schema fields to set up an app for structured data, for unstructured data with metadata, or for website data with custom, structured attributes.
Field settings help determine how Agent Search uses fields in its results. You can use the Schema tab in the Google Cloud console to configure field settings.
Configuring field settings is available only for apps with data stores containing either structured data or unstructured data with metadata.
Field settings
The following field settings are available for many field types in your search or recommendations data, but not for all data types. A schema contains multiple field settings for individual fields, and the next table contains settings which can be applied to a field within a schema. Using structured data is highly recommended for these field settings:
Setting a field to Indexable allows for operations like filtering, boosting, and faceting on structured fields within a document. Fields of type Object can't be set to Indexable. Marking a field as Indexable allows quicker lookups. Note that marking a field as Indexable increases the size of the search index and can slow down indexing.
Example: a field, hotel_chain, is set as indexable. This lets you apply ranking, filtering, and boosting operations on hotel_chain. For example, you can apply a filter so that the search returns only results containing the filtered hotel chain.
Fields that are most likely to be related to searches are designated as Searchable. A field can be searchable without being indexable or retrievable.
Only fields with text values can be marked searchable. Thus, a numeric price field can be indexable (for filtering or faceting) but can't be searchable as full text.
Setting a field to Searchable improves recall for that field in search queries, allowing users to find content, such as web pages, by querying the text within these fields. Marking a field as searchable allows ranking to be applied. Consequently, marking an excessive number of fields as searchable can negatively affect search precision by oversaturating the ranking algorithm and returning too many results. This can lead to irrelevant search returns.
You can apply a relative weighting to searchable fields; however, because of robust defaults, this is rarely necessary. See Weight searchable fields below.
An internet service provider's support ticket system stores each ticket as a structured document. If these documents contain Searchable text fields, such as issue_description or resolution_notes, a support agent can perform a query related to the content in those fields, such as how to fix slow internet speeds after modem reset. The system would then surface the documents that contain any of those search terms, including modem, internet, speed, in either one or both of the issue_description and resolution_notes fields.
Prefix matchable (Preview): Allows prefix matching on text fields using the STARTS_WITH operator in filter expressions. Only fields of type String or String Array can be set to Prefix matchable. For more information, see Make fields available for prefix and partial matching below.
Setting a field to prefix matchable enables the search engine to match query strings that are prefixes of the field's value. This is particularly useful for matching hierarchical identifiers, paths, or codes where the beginning of the string is known. Prefix matching is limited to the first 12 characters of the normalized field value and increases the size of the search index. You can't set more than 10 fields as prefix matchable.
You have a field, ticket_id, that uses a format like <country-code><city-code><number>. Examples include UKLON100, UKMAN100, UKMAN101, and USNY200. To find all tickets from Manchester (UK), you can set the ticket_id field as prefix matchable and then use the filter ticket_id: STARTS_WITH("UKMAN"), which returns UKMAN100 and UKMAN101.
Partially matchable (Preview): Allows partial string matching on text fields using the CONTAINS operator in filter expressions. Only fields of type String or String Array can be set to partially matchable. For more information, see Make fields available for prefix and partial matching below.
Setting a field to partially matchable enables token-based matching within a field, allowing users to find content when only a part of the field value is known. The search engine matches query tokens against tokens in the field value, regardless of their order. Note that marking a field as partially matchable increases the size of the search index. You can't set more than 10 fields as partially matchable.
You want to filter for regions in Europe. The region names include Central Europe and Eastern Europe. If your filter is region: ANY("Europe"), you won't get any matches. However, if the region field is set as partially matchable, you can filter with region: CONTAINS("Europe") and get matches for Central Europe and Eastern Europe.
Dynamic Facetable allows the system to automatically generate interactive filters (facets) based on the unique values present in the field.
Dynamic facetable enables users to dynamically refine search results by selecting categories or attributes directly derived from your ingested data, without having to manually pre-define every possible filter option. This allows the user to narrow down their search to highly specific web content.
Use Dynamic Facetable with Searchable to achieve better results; the combination improves both the recall of your search and the quality of the facets offered to the user.
Example: suppose that documents have fields such as department, document_type, or last_modified_date. If these fields are tagged as dynamic facetable, an employee search for a term like expense reimbursement dynamically generates interactive filters based on the relevant results found. In such a case, the web interface could display facets for Department (Finance, Travel), Document Type (Policy, FAQ), or Last Modified Date (This Quarter, Last Year).
product_id, name, price, and image_url are typical fields that you want to set as retrievable. On the other hand, internal_tracking_code can be indexed and filterable for administrative purposes only, but not retrievable in public search results.
Setting a field to Completable enables values within that field to be used for providing real-time query suggestions as users type. This feature helps guide your users to relevant content and accelerates the search process. Certain factors, such as the use of natural language filtering, can impact this performance.
If the completable setting is enabled for product_name, brand, and category, when the user types Tech, the autocomplete suggestions can show:
- TechCo (from the brand field)
- TechCo UltraBook X1 (from the product_name field)
- Technology GameMaster Pro (another product from the category field)
Filterable helps customize recommendations for users. Note that filtering limits apply.
Example filter: language_code: ANY("en", "fr") OR categories: ANY("drama").
Differences between commonly used settings
There are key differences between the indexable, searchable, and retrievable field settings. The table summarizes these differences.
| Feature | Indexable | Searchable | Retrievable |
|---|---|---|---|
| Primary Role | Makes field content available to search engine | Allows full-text querying against field content | Allows field's value to be returned in search results |
| Analysis | Content is processed and put into an index. | Typically undergoes extensive lexical analysis. | Value is stored as-is for display. |
| Can it be... | | | |
| ...Searchable? | Yes (often a prerequisite) | N/A | Not necessarily (can be retrievable without being searchable) |
| ...Retrievable? | Not necessarily | Not necessarily | N/A |
| ...Filterable/Sortable/Facetable? | Yes (generally a prerequisite for these too) | Not directly; these are separate attributes often built on an indexable field. | Not directly; these attributes relate to how the field is indexed and queried, not just displayed. |
In practice, many fields that are crucial for user experience (such as titles, descriptions, and identifying information) are often set to be indexable, searchable, and retrievable.
Limitations
Field settings have the following limitations:
- You can configure up to 50 fields as indexable, searchable, retrievable, or dynamic facetable.
- To configure a field as dynamic facetable, it must first be configured as indexable.
- Changing the indexable setting requires re-indexing the data, which can take hours, especially for large data stores.
If you are configuring fields for a media search app and want detailed information about the fields in the schema, see About media documents and data stores.
Update field settings
To update field settings:
- In the Google Cloud console, go to the AI Applications page.
- Click the name of the app that you want to edit.
- Click Data.
- Click the Schema tab. This tab shows current field settings.
  You won't see the Schema tab if your data store contains basic website data or unstructured data without metadata.
- Click Edit.
- Select or clear field settings that you need to update. Some field settings are not supported. For example, numerical fields cannot be set to Searchable.
- Click Save to apply your changes.
Weight searchable fields (Preview)
If you mark a field searchable, you can specify a weight to indicate its relative importance in search results. Most situations don't require you to specify weights for individual fields because the default weights work well.
However, adjusting weights can be necessary in a few situations, for example:
- You're migrating data from an existing search platform that already uses weighted fields.
- The default weights aren't providing satisfactory search results. This can happen when you have many searchable fields and some are markedly more important than others. For example:
  - The summary is the most important field for searches, so you want to prioritize that text.
  - The schema has a field containing highly relevant keywords that are excellent predictors for search results but, because this field is much shorter than others, its influence is often overshadowed by longer fields. Increasing its weight ensures it has the intended impact.
Weight levels
Weights are banded into the following levels:
| Field importance | Explanation |
|---|---|
| Very low | A low value that the system still considers when it combines scores from all fields. If you want even less weight so that the effect is negligible, don't mark the field searchable. |
| Low | A weight that is lower than the default. |
| Default | The standard weight for searchable fields. This weight provides reasonably good performance for most cases. |
| High | A weight that is noticeably higher than the default. |
| Very high | A dominating weight. Typically, you reserve this for, at most, one field. |
Schema update and reindexing
Adding weights to searchable fields requires a schema update and subsequent re-indexing of the data in the data store. Updating the schema takes hours, and there isn't a reliable indicator to tell you when indexing is completed, so you need to overestimate the indexing time.
Set weight levels on fields
The task of setting weight levels for fields can be tedious because you should make only small changes and carefully review search results afterward to check for unintended consequences. After each change, you must wait for re-indexing to complete before you can evaluate the impact of the change.
You can configure search field weighting only through the API. This feature is not available in the Google Cloud console.
To set weights, you need to update the schema for the data store through the API projects.locations.dataStores.schemas.patch method.
- If you don't have your schema already, follow the instructions to get your schema in View a schema definition.
- Follow the instructions to update the schema programmatically. Add weights to one or more searchable fields, as in these examples:
      "summary": { "type": "string", "searchable": true, "weight": "high" },
      "uri": { "type": "string", "searchable": true, "weight": "low" },
  In this example, the summary field is set to a higher weight than normal and the uri field to a lower weight. If you want to return a weight to the default value, set it to default. Allowed values for the weight parameter are:
  - very_low
  - low
  - default
  - high
  - very_high
- Wait for reindexing to complete and test the search behavior.
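Before sending a schema update, it can help to validate weights locally. The following is a minimal sketch, not part of the product API: the helper name set_field_weight and the in-memory properties dict are hypothetical, and the sketch assumes that field definitions are available as a Python dict shaped like the JSON fragment above. It only checks that the field is searchable and that the weight is one of the allowed values listed in the steps.

```python
import copy
import json

# Allowed values for the weight parameter, as listed in the steps above.
ALLOWED_WEIGHTS = {"very_low", "low", "default", "high", "very_high"}


def set_field_weight(properties: dict, field: str, weight: str) -> dict:
    """Return a copy of `properties` with `weight` set on one searchable field.

    Hypothetical helper for illustration; it does not call any API.
    """
    if weight not in ALLOWED_WEIGHTS:
        raise ValueError(f"unsupported weight: {weight!r}")
    if field not in properties:
        raise KeyError(field)
    if not properties[field].get("searchable", False):
        # Weights apply only to fields that are marked searchable.
        raise ValueError(f"field {field!r} is not searchable")
    updated = copy.deepcopy(properties)
    updated[field]["weight"] = weight
    return updated


# Example field definitions (assumed shape, mirroring the JSON fragment above).
properties = {
    "summary": {"type": "string", "searchable": True},
    "uri": {"type": "string", "searchable": True},
    "price": {"type": "number", "indexable": True},
}

properties = set_field_weight(properties, "summary", "high")
properties = set_field_weight(properties, "uri", "low")
print(json.dumps(properties, indent=2))
```

The resulting dict would still need to be merged into the full schema and sent through the schemas.patch method; that request is not shown here.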
Make fields available for prefix and partial matching (Preview)
For fields of type string, you can edit the schema to make the fields available for prefix matching or partial matching. This lets you use STARTS_WITH or CONTAINS in filter expressions.
Schema update and reindexing
Making fields available for prefix or partial matching requires a schema update and subsequent re-indexing of the data in the data store. Updating the schema takes hours, and there isn't a reliable indicator to tell you when indexing is completed, so you need to overestimate the indexing time.
Update schema for prefix and partial matching
To specify fields as available for prefix matching or partial matching, you need to update the schema for the data store through the API projects.locations.dataStores.schemas.patch method.
- If you don't have your schema already, follow the instructions to get your schema in View a schema definition.
- Follow the instructions to update the schema programmatically. Set the matchable parameters to true in the schema, as in these examples:
      "zone": { "type": "string", "searchable": true, "prefixMatchable": true },
      "region": { "type": "string", "searchable": true, "partialMatchable": true },
      "model": { "type": "string", "searchable": true, "prefixMatchable": true, "partialMatchable": true },
  In this example, the zone field is set to be prefix matchable. Up to 12 characters, case insensitive, can be used in the filter expression to match the field value. The region field is set to be partially matchable. The model field is set to be both prefix matchable and partially matchable.
- Wait for reindexing to complete.
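The constraints described on this page (string fields only, and no more than 10 fields per matchable setting) can be checked locally before a schema update. The following is a rough sketch under stated assumptions: enable_matching and is_string_field are hypothetical helpers, the properties dict is an assumed in-memory form of the field definitions, and the JSON shape used for a string array ("type": "array" with string "items") is an assumption that this page doesn't confirm.

```python
import copy

MAX_FIELDS_PER_SETTING = 10  # limit stated on this page for each setting
MATCHABLE_SETTINGS = ("prefixMatchable", "partialMatchable")


def is_string_field(field_schema: dict) -> bool:
    """True for a string field or an array of strings (assumed JSON shape)."""
    if field_schema.get("type") == "string":
        return True
    return (
        field_schema.get("type") == "array"
        and field_schema.get("items", {}).get("type") == "string"
    )


def enable_matching(properties: dict, field: str, setting: str) -> dict:
    """Return a copy of `properties` with `setting` set to True on `field`.

    Hypothetical helper for illustration; it does not call any API.
    """
    if setting not in MATCHABLE_SETTINGS:
        raise ValueError(f"unknown setting: {setting!r}")
    if not is_string_field(properties[field]):
        raise ValueError(f"field {field!r} is not a string or string array")
    updated = copy.deepcopy(properties)
    updated[field][setting] = True
    enabled = sum(1 for spec in updated.values() if spec.get(setting))
    if enabled > MAX_FIELDS_PER_SETTING:
        raise ValueError(f"more than {MAX_FIELDS_PER_SETTING} fields use {setting}")
    return updated


properties = {
    "zone": {"type": "string", "searchable": True},
    "region": {"type": "string", "searchable": True},
    "price": {"type": "number", "indexable": True},
}
properties = enable_matching(properties, "zone", "prefixMatchable")
properties = enable_matching(properties, "region", "partialMatchable")
```

As with any schema change, the updated definitions take effect only after the schema is patched and reindexing completes.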
Details about prefix matching
Prefix matching in Agent Search lets you filter results based on whether a field's value starts with a specific string. This capability is powered by the STARTS_WITH operator and requires that the target text field be configured as prefixMatchable in the schema.
Normalization and matching logic
To optimize performance and ensure consistency, the system applies a specific normalization process to both the field value (during indexing) and the query string (during search).
- Lowercasing: All characters are converted to lowercase. This makes the matching case-insensitive.
- 12-character truncation: The system only considers the first 12 characters of the string. Any characters beyond this limit are ignored for prefix matching purposes.
How it works
At indexing time, for a field marked as prefixMatchable, the system generates prefix tokens for the first 12 characters. For a value like asia-south1-c, the index stores tokens for a, as, asi, asia, asia-, and so on, up to the 12th character (asia-south1-).
At query time, the string provided to the STARTS_WITH operator is also lowercased and truncated to 12 characters. The query is then matched against the stored prefix tokens.
Examples
The following examples demonstrate how the 12-character normalization affects search results:
- Broad matches: STARTS_WITH("A") matches any value starting with "a" (case-insensitive), such as asia, australia, or africa.
- Partial prefixes: STARTS_WITH("asia-south") matches both asia-south1-a and asia-southeast1-b because both start with the specified 10-character string.
- Truncation behavior: Because only the first 12 characters are matched, STARTS_WITH("asia-south1-a") matches a field value of asia-south1-c. This occurs because both strings are normalized to the same 12-character prefix: asia-south1-.
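The behavior in these examples can be reproduced with a small simulation. This is only a sketch of the logic as described on this page, not the service's implementation; the function names normalize, prefix_tokens, and starts_with are hypothetical.

```python
PREFIX_LIMIT = 12  # only the first 12 characters take part in prefix matching


def normalize(value: str) -> str:
    """Lowercase the string and keep its first 12 characters."""
    return value.lower()[:PREFIX_LIMIT]


def prefix_tokens(field_value: str) -> set[str]:
    """Every prefix of the normalized value, as generated at indexing time."""
    norm = normalize(field_value)
    return {norm[:i] for i in range(1, len(norm) + 1)}


def starts_with(field_value: str, query: str) -> bool:
    """Simulate STARTS_WITH: the normalized query must equal a stored prefix token."""
    return normalize(query) in prefix_tokens(field_value)


# The three cases from the examples above.
print(starts_with("asia", "A"))
print(starts_with("asia-southeast1-b", "asia-south"))
print(starts_with("asia-south1-c", "asia-south1-a"))
```

In the third call, the query is longer than 12 characters, so it is truncated to asia-south1- before the lookup, which is why it matches a value ending in a different zone letter.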
Details about partial matching
Partial matching in Agent Search lets you filter results based on whether a field's value contains specific words or tokens. This capability is powered by the CONTAINS operator and requires that the target text field be configured as partialMatchable in the schema.
Normalization and tokenization logic
Agent Search normalizes and tokenizes both the field value (during indexing) and the query string (during search):
- Lowercasing: Agent Search converts the characters to lowercase. This makes the matching case-insensitive.
- Tokenization: Agent Search splits the string into individual tokens (words) using spaces and other characters as delimiters.
  - Standard delimiters: Spaces and common punctuation act as delimiters; these include hyphens (-), slashes (/), commas (,), periods (.), asterisks (*), curly brackets ({ }), square brackets ([ ]), round brackets (( )), and apostrophes (').
  - Non-delimiters: Ampersand (&) and underscore (_) don't act as delimiters. The system treats characters joined by these symbols as a single token.
  - Email delimiter (@): The @ symbol acts as a special delimiter to assist in recognizing email addresses. The system generates tokens for the individual components as well as combined forms. For example, it tokenizes support+tier1@example.com to support, tier1, example, com, support+tier1@example.com, and support@example.com.
How it works
At indexing time, for a field marked as partialMatchable, the system normalizes and tokenizes the field value and stores the resulting tokens in the index. For example, it tokenizes a field value of 25-meter, outdoor, swimming pool to [25, meter, outdoor, swimming, pool].
At query time, the string provided to the CONTAINS operator undergoes the same normalization and tokenization process. The search engine then verifies that all resulting query tokens match the stored tokens for the field. The order of the tokens has no effect.
Examples
The following examples demonstrate how tokenization affects partial matching results:
- Basic token matching: If the field value is 25-meter, outdoor, swimming pool, a filter of CONTAINS("Outdoor pool") matches. The system tokenizes the query to [outdoor, pool], both of which exist in the field's tokens.
- Order independence: A filter of CONTAINS("pool outdoor") also matches the field value 25-meter, outdoor, swimming pool because the system checks for the presence of each token independently of their order.
- Email address matching: If a field stores the email support+tier1@example.com, filters like CONTAINS("support"), CONTAINS("example.com"), or CONTAINS("support@example.com") all successfully match due to the special tokenization of the @ symbol.
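The token-based logic can be approximated in a few lines. This is a simplified sketch, not the service's tokenizer: the names tokenize and contains are hypothetical, only the standard delimiters listed on this page are modeled, and the special handling of the @ symbol for email addresses is deliberately left out. Treating an empty query as a non-match is a choice made for this sketch.

```python
import re

# Standard delimiters named on this page: whitespace and - / , . * { } [ ] ( ) '
# Ampersand (&) and underscore (_) are absent on purpose, so they stay inside tokens.
_DELIMITERS = re.compile(r"[\s\-/,.*{}\[\]()']+")


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it on the standard delimiters."""
    return [token for token in _DELIMITERS.split(text.lower()) if token]


def contains(field_value: str, query: str) -> bool:
    """Simulate CONTAINS: every query token must be among the field's tokens."""
    field_tokens = set(tokenize(field_value))
    query_tokens = tokenize(query)
    # An empty query matches nothing here (an assumption of this sketch).
    return bool(query_tokens) and all(t in field_tokens for t in query_tokens)


pool = "25-meter, outdoor, swimming pool"
print(tokenize(pool))
print(contains(pool, "Outdoor pool"))
print(contains(pool, "pool outdoor"))
print(contains("Central Europe", "Europe"))
```

Because the check is a set membership test on whole tokens, a query such as CONTAINS("Euro") wouldn't match Central Europe in this sketch; only complete tokens count.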

