This document describes the syntax for Dataplex Universal Catalog search queries. Before you read this document, make sure that you understand the concepts for metadata management in Dataplex Universal Catalog, such as entries, aspects, aspect types, entry groups, and entry types. For more information, see About metadata management in Dataplex Universal Catalog.
Dataplex Universal Catalog offers two search modes: keyword search and semantic search (Preview).
Keyword search lets you find resources using specific keywords, filters, and a defined syntax.
Semantic search extends keyword search to support natural language queries. It lets you find resources using everyday language, eliminating the need for complex syntax.
This document covers syntax for both keyword and semantic search.
To launch a Dataplex Universal Catalog search query in the Google Cloud console, go to the Dataplex Universal Catalog Search page and select Dataplex Universal Catalog as the search platform.
For more information, see Search for resources in Dataplex Universal Catalog.
Free-text search
You can find assets by entering a term or phrase without any specific syntax. Dataplex Universal Catalog performs a broad search by matching your query against several metadata fields, including the following:
- Name, display name, or description of a resource
- Type of a resource
- Project ID
- Overview description
- Column name (or nested column name) in the schema of a resource
- Column description
- Fully qualified name
- Contacts
- Aspects
Search with query syntax
For more precise searches, you can construct a query using specific syntax, including qualifiers, logical operators, and aspect searches.
Qualified predicates
You can qualify a predicate by prefixing it with a key that restricts the matching to a specific piece of metadata:
- An equal sign (=) restricts the search to an exact match.
- A colon (:) after the key matches the predicate to either a substring or a token within the value in the search results.
Tokenization splits the stream of text into a series of tokens, with each token usually corresponding to a single word.
For example:
- name:foo selects resources with names that contain the foo substring, like foo1 and barfoo.
- description:foo selects resources with the foo token in the description, like "bar and foo".
- location=foo matches resources in a specified location with foo as the location name.
The behavior of these qualifiers can vary slightly between search modes, as detailed in the following sections.
Keyword search qualifiers
The predicate keys type, system, location, and orgid support only the exact match (=) qualifier, not the substring qualifier (:). For example, type=foo or orgid=number.
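As a minimal sketch of the two qualifier forms, the following Python helper builds a qualified predicate string and rejects the substring qualifier for the keys that keyword search restricts to exact matches. The function name `qualified_predicate` and the constant `EXACT_ONLY_KEYS` are hypothetical names chosen for illustration; they are not part of any Dataplex Universal Catalog API.

```python
# Keys that, per the text above, accept only the exact match (=) qualifier
# in keyword search. The constant name is an assumption for this sketch.
EXACT_ONLY_KEYS = {"type", "system", "location", "orgid"}


def qualified_predicate(key: str, value: str, exact: bool = False) -> str:
    """Return key=value for an exact match, or key:value for a substring or token match."""
    if not exact and key in EXACT_ONLY_KEYS:
        raise ValueError(f"{key!r} supports only the exact match (=) qualifier")
    operator = "=" if exact else ":"
    return f"{key}{operator}{value}"


print(qualified_predicate("name", "foo"))
print(qualified_predicate("location", "us-central1", exact=True))
```

The helper only assembles strings on the client side; whether a given query returns results still depends on the catalog contents.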
Dataplex Universal Catalog supports the following qualifiers for keyword search:
- name:x matches x as a substring of the resource ID.
- displayname:x matches x as a substring of the resource display name.
- column:x matches x as a substring of the column name (or nested column name) in the schema of the resource.
- description:x matches x as a token in the resource description.
- label:bar matches bar as a substring of a label key.
- label=bar matches bar as the exact string of a label key.
- label:bar:x matches x as a substring in the value of a label with key bar attached to a BigQuery resource.
- label=foo:bar matches resources where the label key equals foo and the key value equals bar.
- label.foo=bar matches resources where the label key equals foo and the key value equals bar.
- label.foo matches resources that have a label whose key equals foo as a string.
- type=TYPE matches resources of the specified type.
- projectid:bar matches bar as a substring in the project ID.
- parent:x matches x as a substring of the hierarchical path of a resource. The parent path is the fully_qualified_name of the parent resource.
- orgid=number matches resources in the organization whose ID is exactly number.
- system=SYSTEM matches resources from the specified system.
- location=LOCATION matches resources in a specified location with an exact name. For example, location=us-central1 matches assets hosted in Iowa. BigQuery Omni assets support this qualifier by using the BigQuery Omni location name. For example, location=aws-us-east-1 matches BigQuery Omni assets in Northern Virginia.
- createtime finds resources that were created within, before, or after a given date or time. For example:
  - createtime:2019-01-01 matches resources created on 2019-01-01.
  - createtime<2019-02 matches resources created before 2019-02-01T00:00:00.
  - createtime>2019-02 matches resources created after 2019-02-01T00:00:00.
  Timestamp format: YYYY-MM-DDThh:mm:ss. All timestamps must be in GMT; time zones are not supported. Partial timestamps, hyphen (-) date separators, and slash (/) date separators are supported. For example:
  - 2010-10-22T05:36:24
  - 2010-10-22T05:36
  - 2010-10-22T05
  - 2010-10-22
  - 2010-10
  - 2010
  - 2010/10/22
- updatetime finds resources that were updated within, before, or after a given date or time. For example:
  - updatetime:2019-01-01 matches resources updated on 2019-01-01.
  - updatetime<2019-02 matches resources updated before 2019-02-01T00:00:00.
  - updatetime>2019-02 matches resources updated after 2019-02-01T00:00:00.
  Timestamp format: YYYY-MM-DDThh:mm:ss. All timestamps must be in GMT; time zones are not supported. Partial timestamps, hyphen (-) date separators, and slash (/) date separators are supported. For example:
  - 2010-10-22T05:36:24
  - 2010-10-22T05:36
  - 2010-10-22T05
  - 2010-10-22
  - 2010-10
  - 2010
  - 2010/10/22
- fully_qualified_name:x matches x as a substring of fully_qualified_name.
- fully_qualified_name=x matches x as the exact fully_qualified_name.
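To illustrate how the partial timestamps accepted by createtime and updatetime line up with full ones, the following Python sketch parses each documented layout and fills missing fields with their minimum value, which is consistent with createtime<2019-02 being read as "before 2019-02-01T00:00:00". The function name `parse_partial_timestamp` is hypothetical, and the sketch is a client-side illustration only; it makes no claim about how the service parses timestamps internally.

```python
from datetime import datetime

# Layouts from the examples above, from most to least specific.
_FORMATS = (
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%dT%H:%M",
    "%Y-%m-%dT%H",
    "%Y-%m-%d",
    "%Y-%m",
    "%Y",
)


def parse_partial_timestamp(text: str) -> datetime:
    """Parse a full or partial timestamp; missing fields default to their minimum."""
    # Slash date separators are treated the same as hyphens.
    normalized = text.replace("/", "-")
    for fmt in _FORMATS:
        try:
            return datetime.strptime(normalized, fmt)
        except ValueError:
            continue
    raise ValueError(f"unsupported timestamp: {text!r}")


print(parse_partial_timestamp("2019-02"))
print(parse_partial_timestamp("2010/10/22"))
```

The returned values are naive datetimes, which matches the statement that timestamps are in GMT and time zones are not supported.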
Semantic search qualifiers
Preview
This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions.
The predicate keys type, system, location, and description, as well as aspect search (excluding has), support only the exact match (=) qualifier, not the substring qualifier (:). For example, type=foo.
Dataplex Universal Catalog supports the following qualifiers for semantic search:
- name:x matches x as a substring of the resource ID or resource display name.
- displayname:x matches x as a substring of the resource display name.
- column:x matches x as a substring of the column name (or nested column name) in the schema of the resource.
- description:x matches x as a token in the resource description.
- labels:bar matches bar as a substring of a label key.
- labels=bar matches bar as the exact string of a label key.
- labels.bar:x matches x as a substring in the value of a label with key bar attached to a BigQuery resource.
- labels.foo=bar matches resources where the label key equals foo and the key value equals bar.
- type=TYPE matches resources of the specified type.
- projectid:bar matches bar as a substring in the project ID.
- parent:x matches x as a substring of the hierarchical path of a resource.
- system=SYSTEM matches resources from the specified system.
- location=LOCATION matches resources in a specified location with an exact name. For example, location=us-central1 matches assets hosted in Iowa. BigQuery Omni assets support this qualifier by using the BigQuery Omni location name. For example, location=aws-us-east-1 matches BigQuery Omni assets in Northern Virginia.
- createtime finds resources that were created within, before, or after a given date or time. For example:
  - createtime:2019-01-01 matches all resources created on 2019-01-01.
  - createtime<2019-02 matches all resources created before 2019-02-01T00:00:00.
  - createtime>2019-02 matches all resources created after 2019-02-01T00:00:00.
  - createtime>-30d matches all resources created in the last 30 days.
  - createtime<=-30d matches all resources created 30 days ago or earlier.
  - createtime<=-1d matches all resources created on the previous day.
  Timestamp format: YYYY-MM-DDThh:mm:ss. All timestamps must be in GMT; time zones are not supported. Partial timestamps, hyphen (-) date separators, and slash (/) date separators are supported. For example:
  - 2010-10-22T05:36:24
  - 2010-10-22T05:36
  - 2010-10-22T05
  - 2010-10-22
  - 2010-10
  - 2010
  - 2010/10/22
- updatetime finds resources that were updated within, before, or after a given date or time. For example:
  - updatetime:2019-01-01 matches all resources updated on 2019-01-01.
  - updatetime<2019-02 matches all resources updated before 2019-02-01T00:00:00.
  - updatetime>2019-02 matches all resources updated after 2019-02-01T00:00:00.
  - updatetime>-30d matches all resources updated in the last 30 days.
  - updatetime<-30d matches all resources updated 30 days ago or earlier.
  - updatetime=-1d matches all resources updated on the previous day.
  - updatetime>=-30d matches all resources updated in the last 30 days.
  - updatetime<=-30d matches all resources updated 30 days ago or earlier.
  Timestamp format: YYYY-MM-DDThh:mm:ss. All timestamps must be in GMT; time zones are not supported. Partial timestamps, hyphen (-) date separators, and slash (/) date separators are supported. For example:
  - 2010-10-22T05:36:24
  - 2010-10-22T05:36
  - 2010-10-22T05
  - 2010-10-22
  - 2010-10
  - 2010
  - 2010/10/22
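The relative offsets shown for semantic search, such as -30d, can be read as "this many days before today". The following Python sketch resolves such an offset to a calendar date so you can check what a query like createtime>-30d refers to. The function name `resolve_relative_days` is hypothetical, only the d suffix that appears in the examples above is handled, and the exact boundary the service uses (for instance, the time of day) is not stated in this document, so treat the result as an approximation.

```python
import re
from datetime import date, timedelta

# Matches offsets such as -30d or -1d; other units are not covered by this sketch.
_RELATIVE_DAYS = re.compile(r"^-(\d+)d$")


def resolve_relative_days(offset: str, today: date) -> date:
    """Return the calendar date that lies the given number of days before today."""
    match = _RELATIVE_DAYS.match(offset)
    if match is None:
        raise ValueError(f"unsupported relative offset: {offset!r}")
    return today - timedelta(days=int(match.group(1)))


print(resolve_relative_days("-30d", date(2024, 3, 31)))
```

Passing the reference date explicitly keeps the helper deterministic, which makes it easier to test than one that calls date.today() internally.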
Aspect search
To search for entries based on their attached aspects, use the following query syntax.
Keyword search
- aspect:x matches x as a substring of the full path to the aspect type of an aspect that is attached to the entry, in the format projectid.location.ASPECT_TYPE_ID.
- aspect=x matches x as the full path to the aspect type of an aspect that is attached to the entry, in the format projectid.location.ASPECT_TYPE_ID.
- aspect:x OPERATOR value searches for aspect field values. Matches x as a substring of the full path to the aspect type and field name of an aspect that is attached to the entry, in the format projectid.location.ASPECT_TYPE_ID.FIELD_NAME.
The list of supported operators depends on the type of field in the aspect, as follows:
- String: = (exact match) and : (substring)
- All number types: =, :, <, >, <=, >=, =>, =<
- Enum: =
- Datetime: same as for numbers, but the values to compare are treated as datetimes instead of numbers
- Boolean: =
Only top-level fields of the aspect are searchable.
For example, all of the following queries match entries where the value of the is-enrolled field in the employee-info aspect is true. Other entries that match on the substring are also returned.
- aspect:example-project.us-central1.employee-info.is-enrolled=true
- aspect:example-project.us-central1.employee=true
- aspect:employee=true
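The operator rules for keyword aspect search can be captured in a small lookup table. The following Python sketch builds an aspect field predicate and checks the operator against the field type first. The names `SUPPORTED_OPERATORS` and `aspect_field_predicate` are hypothetical and exist only for this illustration; the caller is assumed to know the field type, because the query string itself does not carry it.

```python
# Operators per aspect field type, as listed above for keyword search.
_COMPARISON = {"=", ":", "<", ">", "<=", ">=", "=>", "=<"}
SUPPORTED_OPERATORS = {
    "string": {"=", ":"},
    "number": _COMPARISON,
    "enum": {"="},
    "datetime": _COMPARISON,
    "boolean": {"="},
}


def aspect_field_predicate(path: str, operator: str, value: str, field_type: str) -> str:
    """Build aspect:PATH OPERATOR VALUE (without spaces) after validating the operator."""
    if operator not in SUPPORTED_OPERATORS[field_type]:
        raise ValueError(f"{operator!r} is not supported for {field_type} fields")
    return f"aspect:{path}{operator}{value}"


print(
    aspect_field_predicate(
        "example-project.us-central1.employee-info.is-enrolled", "=", "true", "boolean"
    )
)
```

Because the path is matched as a substring, a shorter path such as employee produces a valid but broader query.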
Semantic search
Preview
- has:x matches x as a substring of the full path to the aspect type of an aspect that is attached to the entry, in the format projectid.location.ASPECT_TYPE_ID.
- has=x matches x as the full path to the aspect type of an aspect that is attached to the entry, in the format projectid.location.ASPECT_TYPE_ID.
- has:x OPERATOR value searches for aspect field values. Matches x as a substring of the full path to the aspect type and field name of an aspect that is attached to the entry, in the following formats:
  - Syntax for system aspect types:
    - ASPECT_TYPE_ID.FIELD_NAME
    - dataplex-types.ASPECT_TYPE_ID.FIELD_NAME
    - dataplex-types.LOCATION.ASPECT_TYPE_ID.FIELD_NAME
    For example, the following queries match entries where the value of the type field in the bigquery-dataset aspect is default:
    - bigquery-dataset.type=default
    - dataplex-types.bigquery-dataset.type=default
    - dataplex-types.global.bigquery-dataset.type=default
  - Syntax for custom aspect types:
    - If the aspect is created in the global region: PROJECT_ID.ASPECT_TYPE_ID.FIELD_NAME
    - If the aspect is created in a specific region: PROJECT_ID.REGION.ASPECT_TYPE_ID.FIELD_NAME
    For example, the following queries match entries where the value of the is-enrolled field in the employee-info aspect is true:
    - example-project.us-central1.employee-info.is-enrolled=true
    - example-project.employee-info.is-enrolled=true
  The list of supported operators depends on the type of field in the aspect, as follows:
  - String: = (exact match)
  - All number types: =, :, <, >, <=, >=, =>, =<
  - Enum: =
  - Datetime: same as for numbers, but the values to compare are treated as datetimes instead of numbers
  - Boolean: =
  Only top-level fields of the aspect are searchable.
Logical operators
A query can consist of several predicates with logical operators. If you don't specify an operator, logical AND is implied. For example, foo bar returns resources that match both predicate foo and predicate bar.
Logical AND and logical OR are supported. For example, foo OR bar.
You can negate a predicate with a - (hyphen) or NOT prefix. For example, -name:foo returns resources with names that don't match the predicate foo.
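The logical operators above can be composed with a few string helpers. The following Python sketch joins predicates with an implied AND, an explicit OR, or a hyphen negation. The function names `all_of`, `any_of`, and `negate` are hypothetical. The helpers deliberately do not nest OR groups inside AND groups, because operator precedence and general grouping are not described in this document.

```python
def all_of(*predicates: str) -> str:
    """Join predicates with a space, which implies logical AND."""
    return " ".join(predicates)


def any_of(*predicates: str) -> str:
    """Join predicates with an explicit logical OR."""
    return " OR ".join(predicates)


def negate(predicate: str) -> str:
    """Negate a predicate with the hyphen prefix."""
    return f"-{predicate}"


print(all_of("foo", "bar"))
print(any_of("foo", "bar"))
print(all_of("description:sales", negate("name:foo")))
```

In the last call, description:sales is an arbitrary example predicate; the query asks for resources with the sales token in the description whose names don't match foo.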
Abbreviated syntax
An abbreviated search syntax is also available, using | (vertical bar) for OR operators and , (comma) for AND operators.
For example, to search for entries inside one of many projects using the OR operator, you can use the following abbreviated syntax:
projectid:(id1|id2|id3|id4)
The same search without using abbreviated syntax looks like the following:
projectid:id1 OR projectid:id2 OR projectid:id3 OR projectid:id4
To search for entries with matching column names, use the following:
- AND: column:(name1, name2, name3)
- OR: column:(name1|name2|name3)
This abbreviated syntax works for the qualified predicates except for label in keyword search.
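To make the equivalence between the abbreviated and the full form concrete, the following Python sketch expands an abbreviated predicate into its long form. The function name `expand_abbreviated` is hypothetical, the sketch handles only the key:(...) shape shown above, and it refuses input that mixes | and , because this document does not say how such a mix is interpreted.

```python
import re

# key:(value1|value2) or key:(value1, value2); nested parentheses are not handled.
_ABBREVIATED = re.compile(r"^(?P<key>[\w.]+):\((?P<values>[^()]+)\)$")


def expand_abbreviated(predicate: str) -> str:
    """Expand key:(a|b) to 'key:a OR key:b' and key:(a, b) to 'key:a AND key:b'."""
    match = _ABBREVIATED.match(predicate.strip())
    if match is None:
        raise ValueError(f"not an abbreviated predicate: {predicate!r}")
    key, values = match["key"], match["values"]
    if "|" in values and "," in values:
        raise ValueError("mixing | and , is not handled by this sketch")
    separator, joiner = ("|", " OR ") if "|" in values else (",", " AND ")
    terms = [term.strip() for term in values.split(separator) if term.strip()]
    return joiner.join(f"{key}:{term}" for term in terms)


print(expand_abbreviated("projectid:(id1|id2|id3|id4)"))
print(expand_abbreviated("column:(name1, name2, name3)"))
```

The AND expansion writes the operator explicitly; leaving it out would be equivalent, because logical AND is implied between predicates.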
What's next
- Learn how to search for resources in Dataplex Universal Catalog.
- Learn more about metadata management in Dataplex Universal Catalog.
- Learn how to enrich entries with metadata by using aspects.
- Learn how to manage entries and ingest custom sources.