DataProfileResult

DataProfileResult defines the output of a DataProfileScan. Each field of the table has a profile result specific to its field type.

JSON representation
{ "rowCount" : string , "profile" : { object ( `Profile` ) } , "scannedData" : { object ( `ScannedData` ) } , "postScanActionsResult" : { object ( `PostScanActionsResult` ) } , "catalogPublishingStatus" : { object ( `DataScanCatalogPublishingStatus` ) } }

Fields
`rowCount`	`string ( int64 format)` Output only. The count of rows scanned.
`profile`	`object ( Profile )` Output only. The profile information per field.
`scannedData`	`object ( ScannedData )` Output only. The data scanned for this result.
`postScanActionsResult`	`object ( PostScanActionsResult )` Output only. The result of post scan actions.
`catalogPublishingStatus`	`object ( DataScanCatalogPublishingStatus )` Output only. The status of publishing the data scan as Dataplex Universal Catalog metadata.
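
Because `rowCount` is an int64 carried as a JSON string, a client typically converts it before doing arithmetic. The following is a minimal sketch, assuming the result has already been fetched as JSON text; the payload values and the helper name `row_count` are hypothetical, and treating a missing `rowCount` as 0 is an assumption rather than documented behavior.

```python
import json

# Hypothetical payload shaped like the JSON representation above;
# the values are made up for illustration.
raw = '{"rowCount": "1250", "profile": {"fields": []}}'


def row_count(result: dict) -> int:
    """Return rowCount as an int; int64 fields arrive as JSON strings."""
    # Assumption: a missing rowCount is treated as 0.
    return int(result.get("rowCount", "0"))


result = json.loads(raw)
print(row_count(result))
```

Python integers are arbitrary precision, so values beyond the 2^53 limit of JSON numbers convert without loss.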

Profile

Contains the name, type, mode, and field-type-specific profile information.

JSON representation
{ "fields" : [ { object ( `Field` ) } ] }

Fields
`fields[]`	`object ( Field )` Output only. List of fields with structural and profile information for each field.

Field

A field within a table.

JSON representation
{ "name" : string , "type" : string , "mode" : string , "profile" : { object ( `ProfileInfo` ) } }

Fields
`name`	`string` Output only. The name of the field.
`type`	`string` Output only. The data type retrieved from the schema of the data source. For instance, for a BigQuery native table, it is the BigQuery Table Schema. For a Dataplex Universal Catalog Entity, it is the Entity Schema.
`mode`	`string` Output only. The mode of the field. Possible values include: REQUIRED, if it is a required field. NULLABLE, if it is an optional field. REPEATED, if it is a repeated field.
`profile`	`object ( ProfileInfo )` Output only. Profile information for the corresponding field.
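
As an illustration of how the `Field` entries of a `Profile` might be consumed, the sketch below groups field names by `mode`. The sample field names and types and the helper `names_by_mode` are hypothetical and not part of the API.

```python
from collections import defaultdict


def names_by_mode(profile: dict) -> dict:
    """Group field names by their mode (REQUIRED, NULLABLE, REPEATED)."""
    grouped = defaultdict(list)
    for field in profile.get("fields", []):
        grouped[field.get("mode", "")].append(field.get("name", ""))
    return dict(grouped)


# Hypothetical Profile object; field names and types are illustrative.
sample_profile = {
    "fields": [
        {"name": "user_id", "type": "INTEGER", "mode": "REQUIRED", "profile": {}},
        {"name": "email", "type": "STRING", "mode": "NULLABLE", "profile": {}},
        {"name": "tags", "type": "STRING", "mode": "REPEATED", "profile": {}},
    ]
}

print(names_by_mode(sample_profile))
```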

ProfileInfo

The profile information for each field type.

JSON representation
{ "nullRatio" : number , "distinctRatio" : number , "topNValues" : [ { object ( `TopNValue` ) } ] , // Union field `field_info` can be only one of the following: "stringProfile" : { object ( `StringFieldInfo` ) } , "integerProfile" : { object ( `IntegerFieldInfo` ) } , "doubleProfile" : { object ( `DoubleFieldInfo` ) } // End of list of possible types for union field `field_info` . }

Fields
`nullRatio`	`number` Output only. Ratio of rows with a null value against total scanned rows.
`distinctRatio`	`number` Output only. Ratio of rows with distinct values against total scanned rows. Not available for complex non-groupable field types, including RECORD, ARRAY, GEOGRAPHY, and JSON, as well as fields with REPEATED mode.
`topNValues[]`	`object ( TopNValue )` Output only. The list of top N non-null values, with the frequency and ratio with which they occur in the scanned data. N is 10 or the number of distinct values in the field, whichever is smaller. Not available for complex non-groupable field types, including RECORD, ARRAY, GEOGRAPHY, and JSON, as well as fields with REPEATED mode.
Union field `field_info`. Structural and profile information for a specific field type. Not available if mode is REPEATED. `field_info` can be only one of the following:
`stringProfile`	`object ( StringFieldInfo )` String type field information.
`integerProfile`	`object ( IntegerFieldInfo )` Integer type field information.
`doubleProfile`	`object ( DoubleFieldInfo )` Double type field information.
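
Since `field_info` is a union, at most one of `stringProfile`, `integerProfile`, and `doubleProfile` is expected in a given `ProfileInfo`. A minimal sketch of detecting which member is set follows; the helper `field_info_kind` and the sample values are hypothetical, and raising on multiple members is a defensive choice of this sketch, not documented API behavior.

```python
from typing import Optional

# Members of the field_info union, as listed in the JSON representation.
FIELD_INFO_KEYS = ("stringProfile", "integerProfile", "doubleProfile")


def field_info_kind(profile_info: dict) -> Optional[str]:
    """Return the union member present in a ProfileInfo, or None if absent."""
    present = [key for key in FIELD_INFO_KEYS if key in profile_info]
    if len(present) > 1:
        raise ValueError(f"expected at most one field_info member, got {present}")
    return present[0] if present else None


# Hypothetical ProfileInfo for a string field.
sample_info = {
    "nullRatio": 0.1,
    "distinctRatio": 0.9,
    "stringProfile": {"minLength": "3", "maxLength": "40", "averageLength": 12.5},
}

print(field_info_kind(sample_info))
```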

TopNValue

Top N non-null values in the scanned data.

JSON representation
{ "value" : string , "count" : string , "ratio" : number }

Fields
`value`	`string` Output only. String value of a top N non-null value.
`count`	`string ( int64 format)` Output only. Count of the corresponding value in the scanned data.
`ratio`	`number` Output only. Ratio of the corresponding value in the field against the total number of rows in the scanned data.
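
A short sketch of normalizing `TopNValue` entries follows: `count` is converted from its int64 string form and `ratio` is kept as a float. The helper `parse_top_n` and the sample values are hypothetical, and the defaults used for missing keys are assumptions of this sketch.

```python
def parse_top_n(values: list) -> list:
    """Convert TopNValue objects to (value, count, ratio) tuples."""
    parsed = []
    for item in values:
        value = item.get("value", "")
        count = int(item.get("count", "0"))  # int64 arrives as a string
        ratio = float(item.get("ratio", 0.0))
        parsed.append((value, count, ratio))
    return parsed


# Hypothetical topNValues list; the values are illustrative.
sample_top_n = [
    {"value": "US", "count": "600", "ratio": 0.6},
    {"value": "DE", "count": "250", "ratio": 0.25},
]

for value, count, ratio in parse_top_n(sample_top_n):
    print(value, count, ratio)
```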

StringFieldInfo

The profile information for a string type field.

JSON representation
{ "minLength" : string , "maxLength" : string , "averageLength" : number }

Fields
`minLength`	`string ( int64 format)` Output only. Minimum length of non-null values in the scanned data.
`maxLength`	`string ( int64 format)` Output only. Maximum length of non-null values in the scanned data.
`averageLength`	`number` Output only. Average length of non-null values in the scanned data.

IntegerFieldInfo

The profile information for an integer type field.

JSON representation
{ "average" : number , "standardDeviation" : number , "min" : string , "quartiles" : [ string ] , "max" : string }

Fields
`average`	`number` Output only. Average of non-null values in the scanned data. NaN, if the field has a NaN.
`standardDeviation`	`number` Output only. Standard deviation of non-null values in the scanned data. NaN, if the field has a NaN.
`min`	`string ( int64 format)` Output only. Minimum of non-null values in the scanned data. NaN, if the field has a NaN.
`quartiles[]`	`string ( int64 format)` Output only. A quartile divides the data points into four parts, or quarters, of roughly equal size. The three main quartiles are: The first quartile (Q1) splits off the lowest 25% of data from the highest 75%. It is also known as the lower or 25th empirical quartile, as 25% of the data is below this point. The second quartile (Q2) is the median of the data set, so 50% of the data lies below this point. The third quartile (Q3) splits off the highest 25% of data from the lowest 75%. It is known as the upper or 75th empirical quartile, as 75% of the data lies below this point. The quartiles are provided as an ordered list of approximate quartile values for the scanned data, in the order Q1, median, Q3.
`max`	`string ( int64 format)` Output only. Maximum of non-null values in the scanned data. NaN, if the field has a NaN.
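
One use of the ordered `quartiles` list is computing the interquartile range (Q3 minus Q1). The sketch below assumes the list holds exactly three int64 strings in the order Q1, median, Q3, as described above; the helper `integer_iqr` and the sample values are hypothetical.

```python
from typing import Optional


def integer_iqr(info: dict) -> Optional[int]:
    """Return Q3 - Q1 for an IntegerFieldInfo, or None if quartiles are missing."""
    quartiles = [int(q) for q in info.get("quartiles", [])]  # int64 strings
    if len(quartiles) != 3:
        return None
    q1, _median, q3 = quartiles
    return q3 - q1


# Hypothetical IntegerFieldInfo; the values are illustrative.
sample_integer_info = {
    "average": 41.5,
    "standardDeviation": 12.0,
    "min": "18",
    "quartiles": ["30", "40", "52"],
    "max": "90",
}

print(integer_iqr(sample_integer_info))
```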

DoubleFieldInfo

The profile information for a double type field.

JSON representation
{ "average" : number , "standardDeviation" : number , "min" : number , "quartiles" : [ number ] , "max" : number }

Fields
`average`	`number` Output only. Average of non-null values in the scanned data. NaN, if the field has a NaN.
`standardDeviation`	`number` Output only. Standard deviation of non-null values in the scanned data. NaN, if the field has a NaN.
`min`	`number` Output only. Minimum of non-null values in the scanned data. NaN, if the field has a NaN.
`quartiles[]`	`number` Output only. A quartile divides the data points into four parts, or quarters, of roughly equal size. The three main quartiles are: The first quartile (Q1) splits off the lowest 25% of data from the highest 75%. It is also known as the lower or 25th empirical quartile, as 25% of the data is below this point. The second quartile (Q2) is the median of the data set, so 50% of the data lies below this point. The third quartile (Q3) splits off the highest 25% of data from the lowest 75%. It is known as the upper or 75th empirical quartile, as 75% of the data lies below this point. The quartiles are provided as an ordered list of quartile values for the scanned data, in the order Q1, median, Q3.
`max`	`number` Output only. Maximum of non-null values in the scanned data. NaN, if the field has a NaN.
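
The statistics above can be NaN, which standard JSON numbers cannot express. The sketch below assumes the response follows the proto3 JSON convention of encoding such values as the strings "NaN", "Infinity", and "-Infinity"; whether this API does so is an assumption, and the helpers `to_float` and `usable_average` are hypothetical.

```python
import math
from typing import Optional


def to_float(value) -> float:
    """Accept a JSON number or a string such as "NaN" or "Infinity"."""
    return float(value)


def usable_average(info: dict) -> Optional[float]:
    """Return the average of a DoubleFieldInfo, or None when it is NaN."""
    # Assumption: a missing average is treated as 0.0.
    average = to_float(info.get("average", 0.0))
    return None if math.isnan(average) else average


print(usable_average({"average": 2.5}))
print(usable_average({"average": "NaN"}))
```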

PostScanActionsResult

The result of post scan actions of a DataProfileScan job.

JSON representation
{ "bigqueryExportResult" : { object ( `BigQueryExportResult` ) } }

Fields
`bigqueryExportResult`	`object ( BigQueryExportResult )` Output only. The result of BigQuery export post scan action.

BigQueryExportResult

The result of BigQuery export post scan action.

JSON representation
{ "state" : enum ( `State` ) , "message" : string }

Fields
`state`	`enum ( State )` Output only. Execution state for the BigQuery exporting.
`message`	`string` Output only. Additional information about the BigQuery exporting.

State

Execution state for the exporting.

Enums
`STATE_UNSPECIFIED`	The exporting state is unspecified.
`SUCCEEDED`	The exporting completed successfully.
`FAILED`	The exporting is no longer running due to an error.
`SKIPPED`	The exporting is skipped because there is no valid scan result to export (usually caused by a failed scan).
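
A caller might branch on the `state` enum of `bigqueryExportResult` as sketched below. The helper `export_summary` and its return strings are hypothetical, and treating a missing `state` as `STATE_UNSPECIFIED` is an assumption of this sketch.

```python
def export_summary(post_scan_actions_result: dict) -> str:
    """Summarize the BigQuery export outcome of a PostScanActionsResult."""
    export = post_scan_actions_result.get("bigqueryExportResult", {})
    # Assumption: a missing state is treated as STATE_UNSPECIFIED.
    state = export.get("state", "STATE_UNSPECIFIED")
    message = export.get("message", "")
    if state == "SUCCEEDED":
        return "export succeeded"
    if state in ("FAILED", "SKIPPED"):
        return f"export {state.lower()}: {message}"
    return "export state unknown"


# Hypothetical result; the message text is illustrative.
sample_result = {
    "bigqueryExportResult": {"state": "FAILED", "message": "table not found"}
}

print(export_summary(sample_result))
```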
