DataProfileResult defines the output of a DataProfileScan. Each field of the table has a profile result specific to its field type.
JSON representation |
---|
{ "rowCount" : string , "profile" : { object ( |
Fields | |
---|---|
rowCount | The count of rows scanned. |
profile | The profile information per field. |
scannedData | The data scanned for this result. |
postScanActionsResult | Output only. The result of post scan actions. |
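
The JSON representation above lists rowCount as a string, so a client usually converts it before doing arithmetic. The following is a minimal sketch, assuming the result has already been fetched and decoded; the sample payload and the helper name `row_count` are hypothetical and not part of the API.

```python
import json

# Hypothetical sample payload; only the key names come from the reference above.
SAMPLE = '{"rowCount": "1200", "profile": {"fields": []}}'


def row_count(result: dict) -> int:
    """Return rowCount as an int; treat a missing key as 0 (an assumption)."""
    return int(result.get("rowCount", "0"))


result = json.loads(SAMPLE)
print(row_count(result))
```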
Profile
Contains the name, type, mode, and field-type-specific profile information of each field.
JSON representation |
---|
{ "fields" : [ { object ( |
Fields | |
---|---|
fields[] | List of fields with structural and profile information for each field. |
Field
A field within a table.
JSON representation |
---|
{ "name" : string , "type" : string , "mode" : string , "profile" : { object ( |
name
string
The name of the field.
type
string
The data type retrieved from the schema of the data source. For instance, for a BigQuery native table, it is the BigQuery Table Schema. For a Dataplex Universal Catalog Entity, it is the Entity Schema.
mode
string
The mode of the field. Possible values include:
- REQUIRED, if it is a required field.
- NULLABLE, if it is an optional field.
- REPEATED, if it is a repeated field.
profile
object (ProfileInfo)
Profile information for the corresponding field.
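
As an illustration of the name and mode fields described above, the sketch below groups field names by mode. The sample `profile` payload and the helper `names_by_mode` are hypothetical; only the key names and the three mode values are taken from this reference.

```python
from collections import defaultdict

# Hypothetical Profile payload shaped like the JSON representation above.
profile = {
    "fields": [
        {"name": "id", "type": "INTEGER", "mode": "REQUIRED", "profile": {}},
        {"name": "tags", "type": "STRING", "mode": "REPEATED", "profile": {}},
        {"name": "email", "type": "STRING", "mode": "NULLABLE", "profile": {}},
    ]
}


def names_by_mode(profile: dict) -> dict:
    """Map each mode (REQUIRED, NULLABLE, REPEATED) to the field names using it."""
    grouped = defaultdict(list)
    for field in profile.get("fields", []):
        grouped[field["mode"]].append(field["name"])
    return dict(grouped)


print(names_by_mode(profile))
```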
ProfileInfo
The profile information for each field type.
JSON representation |
---|
{ "nullRatio" : number , "distinctRatio" : number , "topNValues" : [ { object ( |
nullRatio
number
Ratio of rows with null value against total scanned rows.
distinctRatio
number
Ratio of rows with distinct values against total scanned rows. Not available for complex, non-groupable field types, including RECORD, ARRAY, GEOGRAPHY, and JSON, or for fields with REPEATED mode.
topNValues[]
object (TopNValue)
The list of the top N non-null values, with the frequency and ratio at which they occur in the scanned data. N is 10 or the number of distinct values in the field, whichever is smaller. Not available for complex, non-groupable field types, including RECORD, ARRAY, GEOGRAPHY, and JSON, or for fields with REPEATED mode.
Union field field_info. Structural and profile information for a specific field type. Not available if mode is REPEATED. field_info can be only one of the following:
stringProfile
object (StringFieldInfo)
String type field information.
integerProfile
object (IntegerFieldInfo)
Integer type field information.
doubleProfile
object (DoubleFieldInfo)
Double type field information.
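
Because field_info holds at most one of stringProfile, integerProfile, or doubleProfile, a client can look for whichever key is present. The sketch below is one way to do that on a decoded ProfileInfo dictionary; the helper name `field_info` and the sample inputs are hypothetical.

```python
from typing import Optional, Tuple

# The three members of field_info listed in this reference.
FIELD_INFO_KEYS = ("stringProfile", "integerProfile", "doubleProfile")


def field_info(profile_info: dict) -> Optional[Tuple[str, dict]]:
    """Return (key, value) for the single field_info member present, else None."""
    present = [key for key in FIELD_INFO_KEYS if key in profile_info]
    if len(present) > 1:
        raise ValueError(f"expected at most one of {FIELD_INFO_KEYS}, got {present}")
    if not present:
        return None  # for example, a field whose mode makes field_info unavailable
    return present[0], profile_info[present[0]]


print(field_info({"nullRatio": 0.1, "integerProfile": {"min": "1"}}))
```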
TopNValue
Top N non-null values in the scanned data.
JSON representation |
---|
{ "value" : string , "count" : string , "ratio" : number } |
Fields | |
---|---|
value | String value of a top N non-null value. |
count | Count of the corresponding value in the scanned data. |
ratio | Ratio of the corresponding value in the field against the total number of rows in the scanned data. |
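
The relationship between count and ratio can be sketched as count divided by the number of scanned rows. The example assumes that rowCount from DataProfileResult is the right denominator and that count arrives as a string, as in the JSON representation; the helper `expected_ratio` and the sample values are hypothetical.

```python
import math


def expected_ratio(top_value: dict, row_count: str) -> float:
    """Recompute ratio as count / total scanned rows (both arrive as strings)."""
    return int(top_value["count"]) / int(row_count)


# Hypothetical TopNValue entry for a table with 200 scanned rows.
top = {"value": "US", "count": "50", "ratio": 0.25}
print(math.isclose(expected_ratio(top, "200"), top["ratio"]))
```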
StringFieldInfo
The profile information for a string type field.
JSON representation |
---|
{ "minLength" : string , "maxLength" : string , "averageLength" : number } |
Fields | |
---|---|
minLength | Minimum length of non-null values in the scanned data. |
maxLength | Maximum length of non-null values in the scanned data. |
averageLength | Average length of non-null values in the scanned data. |
IntegerFieldInfo
The profile information for an integer type field.
JSON representation |
---|
{ "average" : number , "standardDeviation" : number , "min" : string , "quartiles" : [ string ] , "max" : string } |
Fields | |
---|---|
average | Average of non-null values in the scanned data. NaN, if the field has a NaN. |
standardDeviation | Standard deviation of non-null values in the scanned data. NaN, if the field has a NaN. |
min | Minimum of non-null values in the scanned data. NaN, if the field has a NaN. |
quartiles[] | A quartile divides the data points into four parts, or quarters, of more or less equal size. The three main quartiles are: the first quartile (Q1), which splits off the lowest 25% of the data from the highest 75% and is also known as the lower quartile or 25th percentile, because 25% of the data lies below this point; the second quartile (Q2), which is the median of the data set, so 50% of the data lies below this point; and the third quartile (Q3), which splits off the highest 25% of the data from the lowest 75% and is also known as the upper quartile or 75th percentile, because 75% of the data lies below this point. The quartiles are provided as an ordered list of approximate quartile values for the scanned data, in the order Q1, median, Q3. |
max | Maximum of non-null values in the scanned data. NaN, if the field has a NaN. |
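
Since the integer quartiles are listed as strings in the order Q1, median, Q3, a derived measure such as the interquartile range needs a conversion step first. The following is a minimal sketch; the helper `interquartile_range` and the sample values are hypothetical.

```python
def interquartile_range(integer_profile: dict) -> int:
    """Return Q3 - Q1 from the ordered quartiles list [Q1, median, Q3]."""
    q1, _median, q3 = (int(q) for q in integer_profile["quartiles"])
    return q3 - q1


# Hypothetical IntegerFieldInfo payload shaped like the JSON representation above.
sample = {
    "average": 42.5,
    "standardDeviation": 7.1,
    "min": "3",
    "quartiles": ["20", "41", "60"],
    "max": "99",
}
print(interquartile_range(sample))
```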
DoubleFieldInfo
The profile information for a double type field.
JSON representation |
---|
{ "average" : number , "standardDeviation" : number , "min" : number , "quartiles" : [ number ] , "max" : number } |
Fields | |
---|---|
average | Average of non-null values in the scanned data. NaN, if the field has a NaN. |
standardDeviation | Standard deviation of non-null values in the scanned data. NaN, if the field has a NaN. |
min | Minimum of non-null values in the scanned data. NaN, if the field has a NaN. |
quartiles[] | A quartile divides the data points into four parts, or quarters, of more or less equal size. The three main quartiles are: the first quartile (Q1), which splits off the lowest 25% of the data from the highest 75% and is also known as the lower quartile or 25th percentile, because 25% of the data lies below this point; the second quartile (Q2), which is the median of the data set, so 50% of the data lies below this point; and the third quartile (Q3), which splits off the highest 25% of the data from the lowest 75% and is also known as the upper quartile or 75th percentile, because 75% of the data lies below this point. The quartiles are provided as an ordered list of quartile values for the scanned data, in the order Q1, median, Q3. |
max | Maximum of non-null values in the scanned data. NaN, if the field has a NaN. |
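
The statistics above can be NaN, so a consumer may want to filter them out before using them. How NaN is encoded in an actual response is an assumption here; the sketch accepts either a JSON number or the string "NaN". The helper `usable_stats` and the sample values are hypothetical.

```python
import math


def usable_stats(double_profile: dict) -> dict:
    """Keep only the scalar statistics that are present and not NaN."""
    stats = {}
    for key in ("average", "standardDeviation", "min", "max"):
        if key in double_profile:
            value = float(double_profile[key])  # float() also parses the string "NaN"
            if not math.isnan(value):
                stats[key] = value
    return stats


print(usable_stats({"average": "NaN", "min": 1.5, "max": 9.0}))
```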
PostScanActionsResult
The result of the post scan actions of a DataProfileScan job.
JSON representation |
---|
{ "bigqueryExportResult" : { object ( |
Fields | |
---|---|
bigqueryExportResult | Output only. The result of BigQuery export post scan action. |
BigQueryExportResult
The result of BigQuery export post scan action.
JSON representation |
---|
{ "state" : enum ( |
Fields | |
---|---|
state | Output only. Execution state of the BigQuery export. |
State
Execution state of the export.
Enums | |
---|---|
STATE_UNSPECIFIED | The export state is unspecified. |
SUCCEEDED | The export completed successfully. |
FAILED | The export is no longer running due to an error. |
SKIPPED | The export was skipped because there was no valid scan result to export (usually because the scan failed). |
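
The enum values above can be mirrored in client code so that an unexpected state fails loudly. The sketch below reads the state from a decoded DataProfileResult dictionary; the `State` class, the helper `export_state`, and the fallback to STATE_UNSPECIFIED for missing keys are assumptions for illustration.

```python
from enum import Enum


class State(Enum):
    """Mirror of the export states listed in this reference."""
    STATE_UNSPECIFIED = "STATE_UNSPECIFIED"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    SKIPPED = "SKIPPED"


def export_state(result: dict) -> State:
    """Read postScanActionsResult.bigqueryExportResult.state from a result dict."""
    raw = (
        result.get("postScanActionsResult", {})
        .get("bigqueryExportResult", {})
        .get("state", "STATE_UNSPECIFIED")
    )
    return State(raw)  # raises ValueError for a value outside the enum


sample = {"postScanActionsResult": {"bigqueryExportResult": {"state": "SKIPPED"}}}
print(export_state(sample))
```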