DataQualityRule

A rule captures data quality intent about a data source.

JSON representation
{
  "column": string,
  "ignoreNull": boolean,
  "dimension": string,
  "threshold": number,
  "name": string,
  "description": string,
  "suspended": boolean,
  "attributes": {
    string: string,
    ...
  },
  "ruleSource": {
    object (RuleSource)
  },
  "debugQueries": [
    {
      object (DebugQuery)
    }
  ],

  // Union field rule_type can be only one of the following:
  "rangeExpectation": {
    object (RangeExpectation)
  },
  "nonNullExpectation": {
    object (NonNullExpectation)
  },
  "setExpectation": {
    object (SetExpectation)
  },
  "regexExpectation": {
    object (RegexExpectation)
  },
  "uniquenessExpectation": {
    object (UniquenessExpectation)
  },
  "statisticRangeExpectation": {
    object (StatisticRangeExpectation)
  },
  "rowConditionExpectation": {
    object (RowConditionExpectation)
  },
  "tableConditionExpectation": {
    object (TableConditionExpectation)
  },
  "sqlAssertion": {
    object (SqlAssertion)
  },
  "templateReference": {
    object (TemplateReference)
  }
  // End of list of possible types for union field rule_type.
}
Fields
column

string

Optional. The unnested column which this rule is evaluated against.

ignoreNull

boolean

Optional. Rows with null values automatically fail a rule unless ignoreNull is true, in which case null rows are trivially considered passing.

This field is only valid for the following types of rules:

  • RangeExpectation
  • RegexExpectation
  • SetExpectation
  • UniquenessExpectation
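
The null handling described for ignoreNull can be sketched as follows. This is a minimal illustration, not the service's implementation: `row_passes` and the `in_set` check are hypothetical names introduced here.

```python
from typing import Callable, Optional


def row_passes(
    value: Optional[str],
    check: Callable[[str], bool],
    ignore_null: bool = False,
) -> bool:
    """Return True if a single column value passes a row-level check."""
    if value is None:
        # Null rows fail by default; with ignoreNull they pass trivially.
        return ignore_null
    return check(value)


# Hypothetical set-membership check standing in for a SetExpectation.
def in_set(value: str) -> bool:
    return value in {"A", "B"}


print(row_passes(None, in_set))                    # null row, ignoreNull unset
print(row_passes(None, in_set, ignore_null=True))  # null row, ignoreNull set
print(row_passes("C", in_set, ignore_null=True))   # non-null rows are still checked
```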
dimension

string

Optional. The dimension a rule belongs to. Results are also aggregated at the dimension level. Custom dimension names are supported; they must use all uppercase letters and have a maximum length of 30 characters.

threshold

number

Optional. The minimum ratio of passing_rows / total_rows required to pass this rule, with a range of [0.0, 1.0].

0 indicates the default value (i.e. 1.0).

This field is only valid for row-level type rules.
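
As a rough sketch of how the threshold could be applied, the function below compares the passing ratio against the effective threshold. The name `rule_passes` is hypothetical, and treating an empty input as passing is an assumption made here, not documented behavior.

```python
def rule_passes(passing_rows: int, total_rows: int, threshold: float = 0.0) -> bool:
    """Compare passing_rows / total_rows against the rule threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in the range [0.0, 1.0]")
    # A threshold of 0 means "use the default", which is 1.0.
    effective = 1.0 if threshold == 0 else threshold
    if total_rows == 0:
        # Assumption for this sketch only: no rows means nothing failed.
        return True
    return passing_rows / total_rows >= effective


print(rule_passes(95, 100, threshold=0.9))  # ratio 0.95 against 0.9
print(rule_passes(95, 100))                 # ratio 0.95 against the default 1.0
```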

name

string

Optional. A mutable name for the rule.

  • The name must contain only letters (a-z, A-Z), numbers (0-9), or hyphens (-).
  • The maximum length is 63 characters.
  • Must start with a letter.
  • Must end with a number or a letter.
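
The naming constraints above can be checked client-side before a rule is submitted. The following is a minimal sketch; `is_valid_rule_name` is a hypothetical helper, and it only applies to non-empty names (the field itself is optional).

```python
import re

# Starts with a letter, ends with a letter or digit,
# and contains only letters, digits, or hyphens in between.
_NAME_RE = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")
_MAX_NAME_LENGTH = 63


def is_valid_rule_name(name: str) -> bool:
    """Return True if the name satisfies the documented constraints."""
    return len(name) <= _MAX_NAME_LENGTH and _NAME_RE.fullmatch(name) is not None


for candidate in ("price-non-negative", "1st-rule", "trailing-", "a" * 64):
    print(candidate[:20], is_valid_rule_name(candidate))
```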
description

string

Optional. Description of the rule.

  • The maximum length is 1,024 characters.
suspended

boolean

Optional. Whether the Rule is active or suspended. Default is false.

attributes

map (key: string, value: string)

Optional. Map of attribute name and value linked to the rule. The rules to evaluate can be filtered based on attributes provided here and a filter expression provided in the DataQualitySpec.filter field.

An object containing a list of "key": value pairs. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.

ruleSource

object ( RuleSource )

Output only. Contains information about the source of the rule and its relationship with the BigQuery table, where applicable.

debugQueries[]

object ( DebugQuery )

Optional. Specifies the debug queries for this rule. Currently, only one query is supported, but this may be expanded in the future.

Union field rule_type. The rule-specific configuration. rule_type can be only one of the following:
rangeExpectation

object ( RangeExpectation )

Row-level rule which evaluates whether each column value lies within a specified range.

nonNullExpectation

object ( NonNullExpectation )

Row-level rule which evaluates whether each column value is non-null.

setExpectation

object ( SetExpectation )

Row-level rule which evaluates whether each column value is contained by a specified set.

regexExpectation

object ( RegexExpectation )

Row-level rule which evaluates whether each column value matches a specified regex.

uniquenessExpectation

object ( UniquenessExpectation )

Row-level rule which evaluates whether each column value is unique.

statisticRangeExpectation

object ( StatisticRangeExpectation )

Aggregate rule which evaluates whether the column aggregate statistic lies within a specified range.

rowConditionExpectation

object ( RowConditionExpectation )

Row-level rule which evaluates whether each row in a table passes the specified condition.

tableConditionExpectation

object ( TableConditionExpectation )

Aggregate rule which evaluates whether the provided expression is true for a table.

sqlAssertion

object ( SqlAssertion )

Aggregate rule which evaluates the number of rows returned for the provided statement. If any rows are returned, this rule fails.

templateReference

object ( TemplateReference )

Aggregate rule which references a rule template and provides the parameters to be substituted in the template. If any rows are returned, this rule fails.
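
Because rule_type is a union field, a rule payload should set exactly one of the ten keys listed above. The sketch below checks that on a plain dictionary before it is sent; `rule_type_of` is a hypothetical helper, and the column and dimension values in the sample rule are illustrative only.

```python
RULE_TYPE_FIELDS = (
    "rangeExpectation",
    "nonNullExpectation",
    "setExpectation",
    "regexExpectation",
    "uniquenessExpectation",
    "statisticRangeExpectation",
    "rowConditionExpectation",
    "tableConditionExpectation",
    "sqlAssertion",
    "templateReference",
)


def rule_type_of(rule: dict) -> str:
    """Return the single rule_type key set on the rule, or raise ValueError."""
    present = [field for field in RULE_TYPE_FIELDS if field in rule]
    if len(present) != 1:
        raise ValueError(f"expected exactly one rule_type field, found {present}")
    return present[0]


# Illustrative rule: the column and dimension values are made up.
rule = {
    "column": "price",
    "dimension": "VALIDITY",
    "rangeExpectation": {"minValue": "0"},
}
print(rule_type_of(rule))
```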

RangeExpectation

Evaluates whether each column value lies within a specified range.

JSON representation
{
  "minValue": string,
  "maxValue": string,
  "strictMinEnabled": boolean,
  "strictMaxEnabled": boolean
}
Fields
minValue

string

Optional. The minimum column value allowed for a row to pass this validation. At least one of minValue and maxValue needs to be provided.

maxValue

string

Optional. The maximum column value allowed for a row to pass this validation. At least one of minValue and maxValue needs to be provided.

strictMinEnabled

boolean

Optional. Whether each value needs to be strictly greater than ('>') the minimum, or if equality is allowed.

Only relevant if a minValue has been defined. Default = false.

strictMaxEnabled

boolean

Optional. Whether each value needs to be strictly less than ('<') the maximum, or if equality is allowed.

Only relevant if a maxValue has been defined. Default = false.
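
The interaction between the bounds and the strict flags can be sketched as below. This is an illustration only: `in_range` is a hypothetical name, and parsing the string bounds with `float` is an assumption made here (the reference does not state how bounds are interpreted for each column type).

```python
from typing import Optional


def in_range(
    value: float,
    min_value: Optional[str] = None,
    max_value: Optional[str] = None,
    strict_min: bool = False,
    strict_max: bool = False,
) -> bool:
    """Return True if value satisfies the configured bounds."""
    if min_value is None and max_value is None:
        raise ValueError("at least one of minValue and maxValue is required")
    if min_value is not None:
        low = float(min_value)  # assumption: numeric bounds
        # With strictMinEnabled, equality with the minimum is not allowed.
        if value < low or (strict_min and value == low):
            return False
    if max_value is not None:
        high = float(max_value)
        # With strictMaxEnabled, equality with the maximum is not allowed.
        if value > high or (strict_max and value == high):
            return False
    return True


print(in_range(0, min_value="0", max_value="10"))                   # inclusive bounds
print(in_range(0, min_value="0", max_value="10", strict_min=True))  # strict minimum
```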

NonNullExpectation

This type has no fields.

Evaluates whether each column value is non-null.

SetExpectation

Evaluates whether each column value is contained by a specified set.

JSON representation
{
  "values": [
    string
  ]
}
Fields
values[]

string

Optional. Expected values for the column value.

RegexExpectation

Evaluates whether each column value matches a specified regex.

JSON representation
{
  "regex": string
}
Fields
regex

string

Optional. A regular expression the column value is expected to match.

UniquenessExpectation

This type has no fields.

Evaluates whether the column has duplicates.

StatisticRangeExpectation

Evaluates whether the column aggregate statistic lies within a specified range.

JSON representation
{
  "statistic": enum (ColumnStatistic),
  "minValue": string,
  "maxValue": string,
  "strictMinEnabled": boolean,
  "strictMaxEnabled": boolean
}
Fields
statistic

enum ( ColumnStatistic )

Optional. The aggregate metric to evaluate.

minValue

string

Optional. The minimum column statistic value allowed for this validation to pass.

At least one of minValue and maxValue needs to be provided.

maxValue

string

Optional. The maximum column statistic value allowed for this validation to pass.

At least one of minValue and maxValue needs to be provided.

strictMinEnabled

boolean

Optional. Whether the column statistic needs to be strictly greater than ('>') the minimum, or if equality is allowed.

Only relevant if a minValue has been defined. Default = false.

strictMaxEnabled

boolean

Optional. Whether the column statistic needs to be strictly less than ('<') the maximum, or if equality is allowed.

Only relevant if a maxValue has been defined. Default = false.

ColumnStatistic

The list of aggregate metrics a rule can be evaluated against.

Enums
STATISTIC_UNDEFINED    Unspecified statistic type.
MEAN                   Evaluate the column mean.
MIN                    Evaluate the column min.
MAX                    Evaluate the column max.
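
To illustrate how a StatisticRangeExpectation differs from a row-level range check, the sketch below computes one aggregate over the column and tests it against the bounds. The function name `statistic_in_range` is hypothetical, the numeric parsing of the string bounds is an assumption, and null handling is left out.

```python
from statistics import fmean
from typing import Optional, Sequence

# Maps the ColumnStatistic enum values to Python aggregates.
_STATISTICS = {"MEAN": fmean, "MIN": min, "MAX": max}


def statistic_in_range(
    values: Sequence[float],
    statistic: str,
    min_value: Optional[str] = None,
    max_value: Optional[str] = None,
    strict_min: bool = False,
    strict_max: bool = False,
) -> bool:
    """Evaluate one column statistic against the configured bounds."""
    if min_value is None and max_value is None:
        raise ValueError("at least one of minValue and maxValue is required")
    stat = _STATISTICS[statistic](values)
    if min_value is not None:
        low = float(min_value)  # assumption: numeric bounds
        if stat < low or (strict_min and stat == low):
            return False
    if max_value is not None:
        high = float(max_value)
        if stat > high or (strict_max and stat == high):
            return False
    return True


column = [1, 2, 3]
print(statistic_in_range(column, "MEAN", min_value="2"))
print(statistic_in_range(column, "MEAN", min_value="2", strict_min=True))
```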

RowConditionExpectation

Evaluates whether each row passes the specified condition.

The SQL expression needs to use GoogleSQL syntax and should produce a boolean value per row as the result.

Example: col1 >= 0 AND col2 < 10

JSON representation
{
  "sqlExpression": string
}
Fields
sqlExpression

string

Optional. The SQL expression.
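
The per-row evaluation can be approximated locally to see how many rows an expression would pass. The sketch below uses Python's built-in sqlite3 module purely as a stand-in engine; SQLite is not GoogleSQL, and the table `t`, its data, and the helper `row_condition_counts` are hypothetical.

```python
import sqlite3


def row_condition_counts(conn: sqlite3.Connection, table: str, sql_expression: str):
    """Return (passing_rows, total_rows) for a boolean per-row expression."""
    query = (
        "SELECT COUNT(*), "
        f"COALESCE(SUM(CASE WHEN ({sql_expression}) THEN 1 ELSE 0 END), 0) "
        f"FROM {table}"
    )
    total, passing = conn.execute(query).fetchone()
    return passing, total


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 INTEGER, col2 INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 5), (-1, 5), (3, 12)])

# Uses the example expression from this section.
passing, total = row_condition_counts(conn, "t", "col1 >= 0 AND col2 < 10")
print(passing, total)
```

The resulting ratio of passing to total rows is what the rule's threshold field is compared against.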

TableConditionExpectation

Evaluates whether the provided expression is true.

The SQL expression needs to use GoogleSQL syntax and should produce a scalar boolean result.

Example: MIN(col1) >= 0

JSON representation
{
  "sqlExpression": string
}
Fields
sqlExpression

string

Optional. The SQL expression.
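
In contrast to a row condition, a table condition yields a single boolean for the whole table. The following sketch again uses sqlite3 as a stand-in for GoogleSQL; the table `t`, its contents, and the helper `table_condition_passes` are assumptions for illustration.

```python
import sqlite3


def table_condition_passes(conn: sqlite3.Connection, table: str, sql_expression: str) -> bool:
    """Evaluate a scalar boolean expression once over the whole table."""
    (result,) = conn.execute(f"SELECT ({sql_expression}) FROM {table}").fetchone()
    # SQLite returns 1/0 for booleans and NULL (None) for unknown.
    return bool(result)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])

# Uses the example expression from this section.
print(table_condition_passes(conn, "t", "MIN(col1) >= 0"))
```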

SqlAssertion

A SQL statement that is evaluated to return rows that match an invalid state. If any rows are returned, this rule fails.

The SQL statement must use GoogleSQL syntax, and must not contain any semicolons.

You can use the data reference parameter ${data()} to reference the source table with all of its precondition filters applied. Examples of precondition filters include row filters, incremental data filters, and sampling. For more information, see Data reference parameter.

Example: SELECT * FROM ${data()} WHERE price < 0

JSON representation
{
  "sqlStatement": string
}
Fields
sqlStatement

string

Optional. The SQL statement.
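
The pass/fail semantics of a SQL assertion can be tried out locally with a sketch like the one below. It is only an approximation: sqlite3 stands in for GoogleSQL, `${data()}` is replaced with a plain table name (the real parameter also applies precondition filters), and the `orders` table and `sql_assertion_passes` helper are hypothetical.

```python
import sqlite3


def sql_assertion_passes(conn: sqlite3.Connection, sql_statement: str, source_table: str) -> bool:
    """Return True if the assertion statement returns no rows."""
    if ";" in sql_statement:
        raise ValueError("the SQL statement must not contain semicolons")
    # Simplification: substitute the data reference with the bare table name.
    resolved = sql_statement.replace("${data()}", source_table)
    return conn.execute(resolved).fetchone() is None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, price REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, -1.0)])

# Uses the example statement from this section.
statement = "SELECT * FROM ${data()} WHERE price < 0"
print(sql_assertion_passes(conn, statement, "orders"))
```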

TemplateReference

A rule that constructs a SQL statement to evaluate using a rule template and parameter values. If the constructed statement returns any rows, this rule fails.

JSON representation
{
  "name": string,
  "values": {
    string: {
      object (ParameterValue)
    },
    ...
  },
  "resolvedSql": string,
  "ruleTemplate": {
    object (DataQualityRuleTemplate)
  }
}
Fields
name

string

Required. The template entry name. The entry must be of EntryType projects/dataplex-types/locations/global/entryTypes/data-quality-rule-template and contain a top-level aspect of AspectType projects/dataplex-types/locations/global/aspectTypes/data-quality-rule-template. The format is: projects/{project_id_or_number}/locations/{locationId}/entryGroups/{entryGroupId}/entries/{entryId}

values

map (key: string, value: object ( ParameterValue ))

Optional. Provides the map of parameter name and value. The maximum size of the field is 120KB (encoded as UTF-8).

An object containing a list of "key": value pairs. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.

resolvedSql

string

Output only. The resolved SQL statement generated from the template with parameters substituted. It is only populated in the result.

ruleTemplate

object ( DataQualityRuleTemplate )

Output only. The rule template used to resolve the rule. It is only populated in the result.

ParameterValue

Represents a parameter value.

JSON representation
{
  "value": string
}
Fields
value

string

Required. Represents the string value of the parameter.
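
A templateReference payload wraps each parameter in a ParameterValue object. The sketch below builds one from a plain mapping and guards the documented size limit. Several details are assumptions: the helper name `template_reference`, the entry name and parameter names in the usage lines, the reading of 120KB as 120 * 1024 bytes, and measuring the limit on the JSON-serialized map.

```python
import json
from typing import Mapping

# Assumption: 120KB is interpreted as 120 * 1024 bytes.
MAX_VALUES_BYTES = 120 * 1024


def template_reference(name: str, params: Mapping[str, str]) -> dict:
    """Build a templateReference dict with only the writable fields."""
    values = {key: {"value": value} for key, value in params.items()}
    # Assumption: size is measured on the UTF-8 encoded JSON of the map.
    size = len(json.dumps(values, ensure_ascii=False).encode("utf-8"))
    if size > MAX_VALUES_BYTES:
        raise ValueError(f"values map is {size} bytes, over the assumed limit")
    # resolvedSql and ruleTemplate are output only, so they are not set here.
    return {"name": name, "values": values}


# Hypothetical entry name and parameter names, for illustration only.
ref = template_reference(
    "projects/my-project/locations/us-central1/entryGroups/my-group/entries/my-template",
    {"column_name": "price", "lower_bound": "0"},
)
print(json.dumps(ref, indent=2))
```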

DataQualityRuleTemplate

DataQualityRuleTemplate represents a template which can be reused across multiple data quality rules.

JSON representation
{
  "name": string,
  "dimension": string,
  "sqlCollection": [
    {
      object (Sql)
    }
  ],
  "inputParameters": {
    string: {
      object (ParameterDescription)
    },
    ...
  },
  "capabilities": [
    string
  ]
}
Fields
name

string

Output only. The name of the rule template in the format: projects/{project_id_or_number}/locations/{locationId}/entryGroups/{entryGroupId}/entries/{entryId}

dimension

string

Output only. The dimension a rule template belongs to. Rule level results are also aggregated at the dimension level.

sqlCollection[]

object ( Sql )

Output only. Collection of SQL queries for data quality rules. Currently, only one SQL query is supported.

inputParameters

map (key: string, value: object ( ParameterDescription ))

Output only. Descriptions of the input parameters.

An object containing a list of "key": value pairs. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.

capabilities[]

string

Output only. A list of features or properties supported by this rule template.

Sql

Templatized SQL query for data quality rules. It can have parameters that can be substituted with values when a rule is created using this template.

JSON representation
{
  "query": string
}
Fields
query

string

Output only. Templatized SQL query for data quality rules.

ParameterDescription

Description of the input parameter. It can include the type(s) supported by the parameter and intended usage. It is for information purposes only and does not affect the behavior of the rule template.

JSON representation
{
  "description": string,
  "defaultValue": string
}
Fields
description

string

Output only. Description of the input parameter. It can include the type(s) supported by the parameter and intended usage. It is for information purposes only and does not affect the behavior of the rule template.

defaultValue

string

Output only. The default value for the parameter if no value is provided.

RuleSource

Represents the rule source information from Catalog.

JSON representation
{
  "rulePathElements": [
    {
      object (RulePathElement)
    }
  ]
}
Fields
rulePathElements[]

object ( RulePathElement )

Output only. Rule path elements represent information about the individual items in the relationship path between the scan resource and rule origin in that order.

RulePathElement

A path element represents the direct relationship between the rule origin (aspects) and the BigQuery entry. The ordering of the rule relationship is maintained such that the first entry in the list is the closest ancestor (the BigQuery table itself). A blank source denotes that the rule is derived directly from the DataScan itself.

JSON representation
{

  // Union field source_type can be only one of the following:
  "entrySource": {
    object (EntrySource)
  },
  "entryLinkSource": {
    object (EntryLinkSource)
  }
  // End of list of possible types for union field source_type.
}
Fields
Union field source_type. The source type of the rule. source_type can be only one of the following:
entrySource

object ( EntrySource )

Output only. Entry source represents information about the related source entry.

EntrySource

Entry source represents information about the related source entry.

JSON representation
{
  "entryType": string,
  "entry": string,
  "displayName": string
}
Fields
entryType

string

Output only. The entry type to represent the current characteristics of the entry in the form of: projects/{project_id_or_number}/locations/{locationId}/entryTypes/{entry-type-id}.

entry

string

Output only. The entry name in the form of: projects/{project_id_or_number}/locations/{locationId}/entryGroups/{entryGroupId}/entries/{entryId}

displayName

string

Output only. The display name of the entry.

EntryLinkSource

Entry link source represents information about the entry link.

JSON representation
{
  "entryLinkType": string,
  "entryLink": string
}
Fields

DebugQuery

Specifies a SQL statement that is evaluated to return up to 10 scalar values that are used to debug rules. If the rule fails, the values can help diagnose the cause of the failure.

The SQL statement must use GoogleSQL syntax, and must not contain any semicolons.

You can use the data reference parameter ${data()} to reference the source table with all of its precondition filters applied. Examples of precondition filters include row filters, incremental data filters, and sampling. For more information, see Data reference parameter.

You can also name results with an explicit alias using [AS] alias. For more information, see BigQuery explicit aliases.

Example: SELECT MIN(col1) AS min_col1, MAX(col1) AS max_col1 FROM ${data()}

JSON representation
{
  "description": string,
  "sqlStatement": string
}
Fields
description

string

Optional. Specifies the description of the debug query.

  • The maximum length is 1,024 characters.
sqlStatement

string

Required. Specifies the SQL statement to be executed.
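
The constraints on a debug query can be checked client-side before the query is attached to a rule. The following is a minimal sketch; `debug_query` is a hypothetical helper, and the description text and the rule fields in the usage lines are illustrative.

```python
MAX_DESCRIPTION_LENGTH = 1024


def debug_query(sql_statement: str, description: str = "") -> dict:
    """Build a DebugQuery dict after checking the documented constraints."""
    if not sql_statement:
        raise ValueError("sqlStatement is required")
    if ";" in sql_statement:
        raise ValueError("the SQL statement must not contain semicolons")
    if len(description) > MAX_DESCRIPTION_LENGTH:
        raise ValueError("description exceeds 1,024 characters")
    query = {"sqlStatement": sql_statement}
    if description:
        query["description"] = description
    return query


# Uses the example statement from this section.
query = debug_query(
    "SELECT MIN(col1) AS min_col1, MAX(col1) AS max_col1 FROM ${data()}",
    description="Observed bounds of col1",
)

# Only one debug query per rule is currently supported.
rule = {
    "column": "col1",
    "rangeExpectation": {"minValue": "0"},
    "debugQueries": [query],
}
print(rule["debugQueries"])
```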
