This document describes the syntax for Data Catalog search queries. Before you read this document, it is important that you understand the concepts of Data Catalog such as data entry, tags and tag templates, and other kinds of metadata. See What is Data Catalog? .
To launch a Data Catalog search query in the Google Cloud console, go to the Data Catalog Searchpage and select Data Catalogas the search mode.
Simple search
In its simplest form, a Data Catalog search query comprises a single predicate. Such a predicate can match several pieces of metadata:
- A substring of a name, display name, or description of a data asset
- Exact type of a data asset
- A substring of a column name (or nested column name) in the schema of a data asset
- A substring of a project ID
- The value of a public tag, the name of a public tag template, or a field name in a public tag template attached to a data entry.
- (Preview) A string for an email address or name for a data steward
- (Preview) A string from an overview description
The simple search does not support tag template fields of type datetime
.
For example, the predicate foo
matches the following entities:
- Data asset with the
foo.barname - Data asset with the
Foo Bardisplay name - Data asset with the description
This is the foo script. - Data asset with the exact
footype - Column
foo_barin the schema of a data asset - Nested column
foo_barin the schema of a data asset - Project
prod-foo-bar - Public tag template named
foo, data entries tagged with thefootag template, tag template display name offoo, tag template field name offoo, and tag field value offooin a string, enum, or rich text. - (Preview) Data asset with a data steward called
foo. - (Preview) Data asset with an overview containing the word
foo.
To know more about the roles and permissions to view public and private tags, see Roles to view public and private tags .
Qualified predicates
You can qualify a predicate by prefixing it with a key that restricts the matching to a specific piece of metadata.
An equal sign ( =
) restricts the search to an exact match.
A colon ( :
) after the key matches the predicate to either a substring or token within
the value in search results.
Tokenization breaks the stream of text up into a series of tokens, with each token usually corresponding to a single word.
For example:
-
name:fooselects entities with names that contain thefoosubstring:foo1andbarfoo. -
description:fooselects entities with thefootoken in the description:bar and foo. -
location=foomatches all data assets in a specified location withfooas the location name.
Data Catalog supports the following qualifiers:
name:x
x
as a substring of the data asset ID.displayname:x
x
as a substring of the data asset display name.column:x
x
as a substring of the column name (or nested column name) in the schema of the data asset.You can search for a nested column by its path using the AND logical operator .
For example,
column:(foo bar)
matches a nested column with the foo.bar
path.description:x
x
as a token in the data asset description.label:bar
bar
as a substring.label=bar
bar
as a string.label:bar:x
x
as a substring in the value of a label with key bar
attached to a BigQuery data asset.label=foo:bar
foo
and the key value equals bar
.label.foo=bar
foo
and the key value equals bar
.label.foo
foo
as a string.type=<type>
<type>.<sub-type>.
Types and subtypes include:
-
type=tablematches all tables, views, and materialized views. -
type=datasetmatches all datasets. -
type=table.viewortype=viewmatches all views but not materialized views. -
type=materialized_viewmatches all materialized views. -
type=lakematches all lakes. -
type=zonematches all zones. -
type=tag_templatematches all tag templates. -
type=entry_groupmatches all entry groups. -
type=data_streammatches all Pub/Sub topics. - (Preview)
type=dataset.linkedmatches all BigQuery sharing (formerly Analytics Hub) linked datasets.
projectid:bar
bar
as a substring in the ID.parent:x
x
as a substring of the hierarchical path of a BigQuery data asset. The path has the format <project_id>.<dataset_name>
.For example,
parent:foo.bar
matches all tables and views of a dataset with the path project-foo.bar-dataset
.orgid=number
number
.system=<system>
Systems include:
-
system=bigquerymatches all data assets from BigQuery. -
system=cloud_bigtablematches all data assets from Bigtable. -
system=cloud_pubsubmatches all data assets from Pub/Sub. -
system=cloud_spannermatches all data assets from Spanner. -
system=dataproc_metastorematches all data assets from Dataproc Metastore. -
system=data_catalogmatches all data assets created in Data Catalog. -
system=dataplexmatches all data assets created in Dataplex Universal Catalog.
location=<location>
location=us-central1
matches all assets hosted in Iowa.For a full list of supported locations, see Data Catalog regions .
cluster_location=<location>
For example,
cluster_location=us-central1
matches all assets hosted in Iowa.For a full list of supported locations, see Bigtable regions .
tag:x
x
matches any substring in < tag_template_project_id
>.< tag_template_id
>.< tag_field_id
> of a private or public tag.Examples:
-
tag:data_ownermatches data assets that have thedata_ownertag. -
tag:data_gov_templatematches data assets that have been tagged with thedata_gov_templatetag template. -
tag:mycloudproject.data_gov_templatematches data assets tagged with thedata_gov_templatetemplate in themycloudprojectproject.
tag:key<operator>val
key
to any substring of the tag field ID, tag template ID, or Google Cloud project ID of a tag template. Then, matches val
to the tag value of the key
depending on the tag field type.The type-dependent
<operator>
sets permitted for tag values are: - string/richtext
: ":"
Note: The colon in this string search denotes an exact token match, not a substring. - boolean and enum : "="
- double : "=", "<", ">", "<=", ">="
- timestamp : ":", "=", "<", ">", "<=", ">="
- string
:
tag:data_owner:@mail.commatches data assets that have@mail.comvalues. - boolean
:
tag:data_gov_template.hasPII=truematcheshasPIIboolean tags in thedata_gov_templatethat aretrue. - enum
:
tag:certification_level_1=HIGHEST. - double
:
tag:datascore=9matches data assets withdatascoredouble tags that have value9. - timestamp
:
tag:expiredDate:2019-01-01matches data assets that have anexpiredDatetag of2019-01-01. - timestamp
:
tag:expiredDate<2019-02matches data assets that have anexpiredDatetag prior to2019-02-01T00:00:00.
createtime
Examples:
-
createtime:2019-01-01matches data assets created on2019-01-01. -
createtime<2019-02matches data assets created prior to2019-02-01T00:00:00. -
createtime>2019-02matches data assets created after2019-02-01T00:00:00.
updatetime
Examples:
-
updatetime:2019-01-01matches data assets updated on2019-01-01. -
updatetime<2019-02matches data assets updated prior to2019-02-01T00:00:00. -
updatetime>2019-02matches data assets updated after2019-02-01T00:00:00.
policytag:x
x
as a substring of the policy tag
display name. Finds all assets using matching policy tag or its descendants.policytagid=x
x
as a policy tag or taxonomy
ID. Finds all assets using matching policy tag or its descendants.term:x
x
.fully_qualified_name:x
x
as a substring of fully_qualified_name
.fully_qualified_name=x
x
as fully_qualified_name
.Logical operators
A query can be comprised of several
predicates with logical operators. If you don't specify an operator, logical AND
is implied. For example, foo bar
returns entities that match both
predicate foo
and predicate bar
.
Logical AND and logical OR are supported, for example, foo OR bar
.
You can negate a predicate with a -
or NOT
prefix. For example, -name:foo
returns
all entities with names that don't match the predicate foo
.
Abbreviated syntax
An abbreviated search syntax is also available, using |
for OR
operators and ,
for AND
operators.
For example, to search for entries inside one of many projects using the OR
operator, you can use:
projectid:(pid1|pid2|pid3|pid4)
Instead of:
projectid:pid1 OR projectid:pid2 OR projectid:pid3 OR projectid:pid4
To search for entries with matching column names:
- AND:
column:(name1, name2, name3) - OR:
column:(name1|name2|name3)
This abbreviated syntax works for the qualified
predicates
listed earlier, except for tag
, term
, policytag
, policytagid
and label
.

