REST Resource: projects.locations.dataScans

Resource: DataScan

Represents a user-visible job which provides the insights for the related data source.

For example:

  • Data quality: generates queries based on the rules and runs them against the data to get data quality check results. For more information, see Auto data quality overview.
  • Data profile: analyzes the data in tables and generates insights about the structure, content and relationships (such as null percent, cardinality, and min/max/mean). For more information, see About data profiling.
  • Data discovery: scans data in Cloud Storage buckets to extract and then catalog metadata. For more information, see Discover and catalog Cloud Storage data.
JSON representation
{
  "name": string,
  "uid": string,
  "description": string,
  "displayName": string,
  "labels": {
    string: string,
    ...
  },
  "state": enum (State),
  "createTime": string,
  "updateTime": string,
  "data": {
    object (DataSource)
  },
  "executionSpec": {
    object (ExecutionSpec)
  },
  "executionStatus": {
    object (ExecutionStatus)
  },
  "type": enum (DataScanType),

  // Union field spec can be only one of the following:
  "dataQualitySpec": {
    object (DataQualitySpec)
  },
  "dataProfileSpec": {
    object (DataProfileSpec)
  },
  "dataDiscoverySpec": {
    object (DataDiscoverySpec)
  }
  // End of list of possible types for union field spec.

  // Union field result can be only one of the following:
  "dataQualityResult": {
    object (DataQualityResult)
  },
  "dataProfileResult": {
    object (DataProfileResult)
  },
  "dataDiscoveryResult": {
    object (DataDiscoveryResult)
  }
  // End of list of possible types for union field result.
}
Fields
name

string

Output only. Identifier. The relative resource name of the scan, of the form: projects/{project}/locations/{locationId}/dataScans/{datascanId}, where project refers to a projectId or project_number and locationId refers to a Google Cloud region.
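
The name format above can be split into its components with a regular expression. The following is a minimal sketch, not part of the API: the helper parse_datascan_name and the sample project, location and scan IDs are hypothetical, and the pattern only checks the overall shape of the name, not whether the IDs are valid.

```python
import re

# Shape of the relative resource name documented for the `name` field:
# projects/{project}/locations/{locationId}/dataScans/{datascanId}
_DATASCAN_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/dataScans/(?P<datascan>[^/]+)$"
)


def parse_datascan_name(name: str) -> dict:
    """Split a DataScan resource name into project, location and scan ID.

    Hypothetical helper for illustration; it validates structure only.
    """
    match = _DATASCAN_NAME.match(name)
    if match is None:
        raise ValueError(f"not a DataScan resource name: {name!r}")
    return match.groupdict()


# Sample values are made up for the example.
parts = parse_datascan_name(
    "projects/my-project/locations/us-central1/dataScans/my-scan"
)
print(parts)
```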

uid

string

Output only. System generated globally unique ID for the scan. This ID will be different if the scan is deleted and re-created with the same name.

description

string

Optional. Description of the scan.

  • Must be between 1 and 1024 characters.
displayName

string

Optional. User friendly display name.

  • Must be between 1 and 256 characters.
labels

map (key: string, value: string)

Optional. User-defined labels for the scan.

An object containing a list of "key": value pairs. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.

state

enum ( State )

Output only. Current state of the DataScan.

createTime

string ( Timestamp format)

Output only. The time when the scan was created.

Uses RFC 3339, where generated output is always Z-normalized and uses 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".

updateTime

string ( Timestamp format)

Output only. The time when the scan was last updated.

Uses RFC 3339, where generated output is always Z-normalized and uses 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".
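
Timestamp strings such as createTime and updateTime can be converted to timezone-aware values on the client. The following is a minimal sketch using only the Python standard library. The helper name parse_timestamp is hypothetical, and because Python's datetime stores microseconds, any digits beyond the sixth fractional digit are truncated here.

```python
import re
from datetime import datetime, timedelta, timezone

# Date, time, optional 1-9 fractional digits, then "Z" or a numeric offset.
_RFC3339 = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})[Tt](\d{2}):(\d{2}):(\d{2})"
    r"(?:\.(\d{1,9}))?"
    r"([Zz]|[+-]\d{2}:\d{2})$"
)


def parse_timestamp(value: str) -> datetime:
    """Parse an RFC 3339 timestamp string into a UTC datetime.

    Hypothetical helper for illustration; nanoseconds are truncated
    to microseconds because datetime cannot hold more precision.
    """
    m = _RFC3339.match(value)
    if m is None:
        raise ValueError(f"not an RFC 3339 timestamp: {value!r}")
    year, month, day, hour, minute, second = (int(g) for g in m.groups()[:6])
    fraction = m.group(7) or ""
    micro = int(fraction.ljust(6, "0")[:6]) if fraction else 0
    offset = m.group(8)
    if offset in ("Z", "z"):
        tz = timezone.utc
    else:
        sign = 1 if offset[0] == "+" else -1
        delta = timedelta(hours=int(offset[1:3]), minutes=int(offset[4:6]))
        tz = timezone(sign * delta)
    local = datetime(year, month, day, hour, minute, second, micro, tzinfo=tz)
    return local.astimezone(timezone.utc)


# The three example strings from the field description.
for text in (
    "2014-10-02T15:01:23Z",
    "2014-10-02T15:01:23.045123456Z",
    "2014-10-02T15:01:23+05:30",
):
    print(text, "->", parse_timestamp(text).isoformat())
```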

data

object ( DataSource )

Required. The data source for DataScan.

executionSpec

object ( ExecutionSpec )

Optional. DataScan execution settings.

If not specified, the fields in it will use their default values.

executionStatus

object ( ExecutionStatus )

Output only. Status of the data scan execution.

type

enum ( DataScanType )

Output only. The type of DataScan.

Union field spec. Data scan related settings. The settings are required and immutable. After you configure the settings for one type of data scan, you can't change the data scan to a different type of data scan. spec can be only one of the following:
dataQualitySpec

object ( DataQualitySpec )

Settings for a data quality scan.

dataProfileSpec

object ( DataProfileSpec )

Settings for a data profile scan.

dataDiscoverySpec

object ( DataDiscoverySpec )

Settings for a data discovery scan.
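
Because spec is a union field, a request body must carry exactly one of the three spec objects. The following is a minimal client-side sketch of that check. The helper selected_spec and the sample body are hypothetical, and the empty dataProfileSpec object is only a placeholder, since the fields of DataProfileSpec are not covered in this section.

```python
# The three members of the `spec` union, as listed in the field table.
SPEC_FIELDS = ("dataQualitySpec", "dataProfileSpec", "dataDiscoverySpec")


def selected_spec(data_scan: dict) -> str:
    """Return the name of the single spec field set on a DataScan body.

    Hypothetical helper for illustration; raises ValueError when zero
    or more than one spec field is present.
    """
    present = [field for field in SPEC_FIELDS if field in data_scan]
    if len(present) != 1:
        raise ValueError(
            f"exactly one of {SPEC_FIELDS} must be set, found: {present}"
        )
    return present[0]


# Sample body with made-up values; only fields named in this page are used.
scan_body = {
    "displayName": "Orders profile",
    "data": {
        "resource": "//bigquery.googleapis.com/projects/my-project"
                    "/datasets/sales/tables/orders"
    },
    "dataProfileSpec": {},  # placeholder; spec contents are out of scope here
}

print(selected_spec(scan_body))
```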

Union field result. The result of the data scan. result can be only one of the following:
dataQualityResult

object ( DataQualityResult )

Output only. The result of a data quality scan.

dataProfileResult

object ( DataProfileResult )

Output only. The result of a data profile scan.

dataDiscoveryResult

object ( DataDiscoveryResult )

Output only. The result of a data discovery scan.

DataSource

The data source for DataScan.

JSON representation
{
  // Union field source can be only one of the following:
  "entity": string,
  "resource": string
  // End of list of possible types for union field source.
}
Fields
Union field source. The source is required and immutable. Once it is set, it cannot be changed. source can be only one of the following:
entity

string

Immutable. The Dataplex Universal Catalog entity that represents the data source (e.g. BigQuery table) for DataScan, of the form: projects/{project_number}/locations/{locationId}/lakes/{lakeId}/zones/{zoneId}/entities/{entityId}.

resource

string

Immutable. The service-qualified full resource name of the cloud resource for a DataScan job to scan against. The field can be either:

  • A Cloud Storage bucket for DataDiscoveryScan, in the format: //storage.googleapis.com/projects/PROJECT_ID/buckets/BUCKET_ID
  • A BigQuery table of type "TABLE" for DataProfileScan/DataQualityScan, in the format: //bigquery.googleapis.com/projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_ID
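
The two resource formats, and the rule that a DataSource sets either entity or resource, can be sketched as small helpers. This is an illustration only: the function names and the sample IDs are hypothetical, and the helpers simply fill in the documented templates without checking that the resources exist.

```python
from typing import Optional


def bigquery_table_resource(project_id: str, dataset_id: str, table_id: str) -> str:
    """Fill in the documented BigQuery table format (hypothetical helper)."""
    return (
        f"//bigquery.googleapis.com/projects/{project_id}"
        f"/datasets/{dataset_id}/tables/{table_id}"
    )


def storage_bucket_resource(project_id: str, bucket_id: str) -> str:
    """Fill in the documented Cloud Storage bucket format (hypothetical helper)."""
    return f"//storage.googleapis.com/projects/{project_id}/buckets/{bucket_id}"


def data_source(*, entity: Optional[str] = None, resource: Optional[str] = None) -> dict:
    """Build a DataSource body with exactly one member of the source union."""
    if (entity is None) == (resource is None):
        raise ValueError("set exactly one of 'entity' or 'resource'")
    if entity is not None:
        return {"entity": entity}
    return {"resource": resource}


# Sample IDs are made up for the example.
profile_source = data_source(
    resource=bigquery_table_resource("my-project", "sales", "orders")
)
discovery_source = data_source(
    resource=storage_bucket_resource("my-project", "raw-events")
)
print(profile_source)
print(discovery_source)
```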

ExecutionSpec

DataScan execution settings.

JSON representation
{
  "trigger": {
    object (Trigger)
  },

  // Union field incremental can be only one of the following:
  "field": string
  // End of list of possible types for union field incremental.
}
Fields
trigger

object ( Trigger )

Optional. Spec related to how often and when a scan should be triggered.

If not specified, the default is OnDemand, which means the scan will not run until the user calls the dataScans.run API.

Union field incremental. Spec related to incremental scan of the data.

When an option is selected for incremental scan, it cannot be unset or changed. If not specified, a data scan will run for all data in the table. incremental can be only one of the following:

field

string

Immutable. The unnested field (of type Date or Timestamp) that contains values which monotonically increase over time.

If not specified, a data scan will run for all data in the table.

Trigger

DataScan scheduling and trigger settings.

JSON representation
{
  // Union field mode can be only one of the following:
  "onDemand": {
    object (OnDemand)
  },
  "schedule": {
    object (Schedule)
  }
  // End of list of possible types for union field mode.
}
Fields

Union field mode. DataScan scheduling and trigger settings.

If not specified, the default is onDemand. mode can be only one of the following:

onDemand

object ( OnDemand )

The scan runs once via the dataScans.run API.

schedule

object ( Schedule )

The scan is scheduled to run periodically.
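
The ExecutionSpec and Trigger shapes described above can be assembled as plain JSON-compatible dictionaries. The following is a minimal sketch: the helper build_execution_spec, the column name event_date and the sample cron string are hypothetical. It writes the onDemand trigger explicitly, although the text above says that omitting the trigger has the same default.

```python
from typing import Optional


def build_execution_spec(
    cron: Optional[str] = None,
    incremental_field: Optional[str] = None,
) -> dict:
    """Build an ExecutionSpec body (hypothetical helper for illustration).

    - With `cron`, the trigger uses the `schedule` mode.
    - Without it, the trigger uses the `onDemand` mode (an empty object,
      since OnDemand has no fields).
    - `incremental_field` sets the immutable `field` option; when omitted,
      a scan covers all data in the table.
    """
    if cron is not None:
        trigger = {"schedule": {"cron": cron}}
    else:
        trigger = {"onDemand": {}}
    spec = {"trigger": trigger}
    if incremental_field is not None:
        spec["field"] = incremental_field
    return spec


# On-demand scan over the whole table.
print(build_execution_spec())

# Scheduled incremental scan; the column name and cron string are made up.
print(build_execution_spec(cron="0 3 * * *", incremental_field="event_date"))
```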

OnDemand

This type has no fields.

The scan runs once via the dataScans.run API.

Schedule

The scan is scheduled to run periodically.

JSON representation
{
  "cron": string
}
Fields
cron

string

Required. Cron schedule for running scans periodically.

To explicitly set a time zone, prefix the cron expression with "CRON_TZ=${IANA_TIME_ZONE}" or "TZ=${IANA_TIME_ZONE}". The ${IANA_TIME_ZONE} must be a valid string from the IANA time zone database. For example, CRON_TZ=America/New_York 1 * * * *, or TZ=America/New_York 1 * * * *.

This field is required for Schedule scans.
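
The prefixed cron string can be assembled with a small helper. This is a minimal sketch under stated assumptions: the function cron_with_timezone is hypothetical, the five-field check is inferred from the examples above rather than stated by the API, and the time zone name is not validated against the IANA database.

```python
def cron_with_timezone(
    expression: str,
    iana_time_zone: str,
    prefix: str = "CRON_TZ",
) -> str:
    """Prefix a cron expression with a time zone setting.

    Hypothetical helper for illustration. `prefix` must be one of the
    two documented forms, "CRON_TZ" or "TZ".
    """
    if prefix not in ("CRON_TZ", "TZ"):
        raise ValueError("prefix must be 'CRON_TZ' or 'TZ'")
    # Assumption: a five-field expression, as in the documented examples.
    if len(expression.split()) != 5:
        raise ValueError("expected a five-field cron expression")
    return f"{prefix}={iana_time_zone} {expression}"


# Reproduces the two examples from the field description.
print(cron_with_timezone("1 * * * *", "America/New_York"))
print(cron_with_timezone("1 * * * *", "America/New_York", prefix="TZ"))
```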

ExecutionStatus

Status of the data scan execution.

JSON representation
{
  "latestJobStartTime": string,
  "latestJobEndTime": string,
  "latestJobCreateTime": string
}
Fields
latestJobStartTime

string ( Timestamp format)

Optional. The time when the latest DataScanJob started.

Uses RFC 3339, where generated output is always Z-normalized and uses 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".

latestJobEndTime

string ( Timestamp format)

Optional. The time when the latest DataScanJob ended.

Uses RFC 3339, where generated output is always Z-normalized and uses 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".

latestJobCreateTime

string ( Timestamp format)

Optional. The time when the DataScanJob execution was created.

Uses RFC 3339, where generated output is always Z-normalized and uses 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".

Methods

create

Creates a DataScan resource.

delete

Deletes a DataScan resource.

generateDataQualityRules

Generates recommended data quality rules based on the results of a data profiling scan.

get

Gets a DataScan resource.

getIamPolicy

Gets the access control policy for a resource.

list

Lists DataScans.

patch

Updates a DataScan resource.

run

Runs an on-demand execution of a DataScan.

setIamPolicy

Sets the access control policy on the specified resource.

testIamPermissions

Returns permissions that a caller has on the specified resource.