DataProfileResult defines the output of a DataProfileScan. Each field of the table has a profile result specific to its field type.
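| JSON representation |
|---|
| { "rowCount" : string , "profile" : { object (Profile) } , "scannedData" : { object (ScannedData) } , "postScanActionsResult" : { object (PostScanActionsResult) } , "catalogPublishingStatus" : { object (DataScanCatalogPublishingStatus) } } |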
| Fields | |
|---|---|
| rowCount | Output only. The count of rows scanned. |
| profile | Output only. The profile information per field. |
| scannedData | Output only. The data scanned for this result. |
| postScanActionsResult | Output only. The result of post-scan actions. |
| catalogPublishingStatus | Output only. The status of publishing the data scan as Dataplex Universal Catalog metadata. |
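
Because the JSON representation carries rowCount as a string rather than a number, a client usually has to convert it before doing arithmetic. The following is a minimal sketch of that conversion. The payload is hand-written for illustration and `parse_row_count` is a hypothetical helper, not part of any client library.

```python
import json

# Hypothetical, hand-written payload shaped like a DataProfileResult.
# The values are illustrative, not real scan output.
SAMPLE_RESULT = """
{
  "rowCount": "1200",
  "profile": {"fields": []}
}
"""


def parse_row_count(result: dict) -> int:
    """Return rowCount as an int; the JSON form carries it as a string."""
    # Assumption: a missing rowCount is treated as zero rows scanned.
    return int(result.get("rowCount", "0"))


result = json.loads(SAMPLE_RESULT)
row_count = parse_row_count(result)
print(row_count)
```

Keeping the conversion in one helper avoids string comparisons such as `"900" > "1200"` elsewhere in the code.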
Profile
Contains the name, type, mode, and field-type-specific profile information for each field.
| JSON representation |
|---|
| { "fields" : [ { object (Field) } ] } |
| Fields | |
|---|---|
| fields[] | Output only. List of fields with structural and profile information for each field. |
Field
A field within a table.
| JSON representation |
|---|
| { "name" : string , "type" : string , "mode" : string , "profile" : { object (ProfileInfo) } } |
| Fields | |
|---|---|
| name | string. Output only. The name of the field. |
| type | string. Output only. The data type retrieved from the schema of the data source. For instance, for a BigQuery native table, it is the BigQuery Table Schema. For a Dataplex Universal Catalog Entity, it is the Entity Schema. |
| mode | string. Output only. The mode of the field. Possible values include: REQUIRED, if it is a required field; NULLABLE, if it is an optional field; REPEATED, if it is a repeated field. |
| profile | object (ProfileInfo). Output only. Profile information for the corresponding field. |
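
As an illustration of how the name and mode values might be consumed, the sketch below groups field names by mode. The sample Profile object is hand-written, the type strings in it are assumptions, and `fields_by_mode` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical Profile object; names, types, and modes are illustrative.
SAMPLE_PROFILE = {
    "fields": [
        {"name": "id", "type": "INTEGER", "mode": "REQUIRED", "profile": {}},
        {"name": "email", "type": "STRING", "mode": "NULLABLE", "profile": {}},
        {"name": "tags", "type": "STRING", "mode": "REPEATED", "profile": {}},
    ]
}


def fields_by_mode(profile: dict) -> dict:
    """Group field names by their mode (REQUIRED, NULLABLE, REPEATED)."""
    grouped = defaultdict(list)
    for field in profile.get("fields", []):
        grouped[field.get("mode", "")].append(field["name"])
    return dict(grouped)


grouped = fields_by_mode(SAMPLE_PROFILE)
print(grouped)
```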
ProfileInfo
The profile information for each field type.
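| JSON representation |
|---|
| { "nullRatio" : number , "distinctRatio" : number , "topNValues" : [ { object (TopNValue) } ] , "stringProfile" : { object (StringFieldInfo) } , "integerProfile" : { object (IntegerFieldInfo) } , "doubleProfile" : { object (DoubleFieldInfo) } } |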
| Fields | |
|---|---|
| nullRatio | number. Output only. Ratio of rows with a null value against total scanned rows. |
| distinctRatio | number. Output only. Ratio of rows with distinct values against total scanned rows. Not available for complex non-groupable field types, including RECORD, ARRAY, GEOGRAPHY, and JSON, as well as fields with REPEATED mode. |
| topNValues[] | object (TopNValue). Output only. The list of the top N non-null values, with the frequency and ratio at which they occur in the scanned data. N is 10 or the number of distinct values in the field, whichever is smaller. Not available for complex non-groupable field types, including RECORD, ARRAY, GEOGRAPHY, and JSON, as well as fields with REPEATED mode. |
| Union field field_info | Structural and profile information for the specific field type. Not available if mode is REPEATED. field_info can be only one of the following: |
| stringProfile | object (StringFieldInfo). String type field information. |
| integerProfile | object (IntegerFieldInfo). Integer type field information. |
| doubleProfile | object (DoubleFieldInfo). Double type field information. |
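
Since field_info is a union, a ProfileInfo object carries at most one of stringProfile, integerProfile, or doubleProfile. A client might detect which one is present along these lines. This is a sketch only: `field_info_kind` is a hypothetical helper and the sample values are made up.

```python
# The three members of the field_info union, as named in this reference.
FIELD_INFO_KEYS = ("stringProfile", "integerProfile", "doubleProfile")


def field_info_kind(profile_info: dict):
    """Return the union member present in a ProfileInfo dict, or None."""
    present = [key for key in FIELD_INFO_KEYS if key in profile_info]
    if len(present) > 1:
        # A union should never carry more than one member.
        raise ValueError(f"expected at most one of {FIELD_INFO_KEYS}, got {present}")
    return present[0] if present else None


# Hypothetical ProfileInfo for a string column; the numbers are made up.
sample = {
    "nullRatio": 0.1,
    "distinctRatio": 0.8,
    "stringProfile": {"minLength": "3", "maxLength": "12", "averageLength": 7.5},
}
kind = field_info_kind(sample)
print(kind)
```

Returning None when no member is present covers repeated fields, for which the reference says field_info is not available.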
TopNValue
Top N non-null values in the scanned data.
| JSON representation |
|---|
{ "value" : string , "count" : string , "ratio" : number } |
| Fields | |
|---|---|
| value | Output only. String value of a top N non-null value. |
| count | Output only. Count of the corresponding value in the scanned data. |
| ratio | Output only. Ratio of the corresponding value in the field against the total number of rows in the scanned data. |
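
To make the definitions of count and ratio concrete, the sketch below computes TopNValue-shaped entries from an in-memory column. It illustrates the definitions only and is not the service's implementation. `top_n_values` is a hypothetical helper, and it assumes the ratio denominator counts every scanned row, nulls included.

```python
from collections import Counter


def top_n_values(values, n=10):
    """Build TopNValue-like dicts for the most frequent non-null values."""
    total_rows = len(values)  # Assumption: nulls count toward the total.
    counts = Counter(v for v in values if v is not None)
    # most_common(n) returns at most n entries, so the result length is
    # min(n, number of distinct non-null values).
    return [
        {"value": str(value), "count": str(count), "ratio": count / total_rows}
        for value, count in counts.most_common(n)
    ]


top = top_n_values(["a", "a", "b", None])
print(top)
```

The count is emitted as a string to mirror the JSON representation above.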
StringFieldInfo
The profile information for a string type field.
| JSON representation |
|---|
{ "minLength" : string , "maxLength" : string , "averageLength" : number } |
| Fields | |
|---|---|
| minLength | Output only. Minimum length of non-null values in the scanned data. |
| maxLength | Output only. Maximum length of non-null values in the scanned data. |
| averageLength | Output only. Average length of non-null values in the scanned data. |
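
The three length statistics can be illustrated on a small in-memory column. This sketch assumes length means the number of characters, which this reference does not specify, and `string_field_info` is a hypothetical helper rather than the service's algorithm.

```python
def string_field_info(values):
    """Return StringFieldInfo-like stats over non-null strings, or None."""
    # Assumption: length is measured in characters (Python str length).
    lengths = [len(v) for v in values if v is not None]
    if not lengths:
        return None  # No non-null values, so nothing to summarize.
    return {
        "minLength": str(min(lengths)),
        "maxLength": str(max(lengths)),
        "averageLength": sum(lengths) / len(lengths),
    }


info = string_field_info(["ab", None, "abcd"])
print(info)
```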
IntegerFieldInfo
The profile information for an integer type field.
| JSON representation |
|---|
{ "average" : number , "standardDeviation" : number , "min" : string , "quartiles" : [ string ] , "max" : string } |
| Fields | |
|---|---|
| average | Output only. Average of non-null values in the scanned data. NaN, if the field has a NaN. |
| standardDeviation | Output only. Standard deviation of non-null values in the scanned data. NaN, if the field has a NaN. |
| min | Output only. Minimum of non-null values in the scanned data. NaN, if the field has a NaN. |
| quartiles[] | Output only. A quartile divides the number of data points into four parts, or quarters, of more-or-less equal size. The three main quartiles are: the first quartile (Q1), which splits off the lowest 25% of data from the highest 75% and is also known as the lower or 25th empirical quartile, as 25% of the data is below this point; the second quartile (Q2), which is the median of the data set, so 50% of the data lies below this point; and the third quartile (Q3), which splits off the highest 25% of data from the lowest 75% and is known as the upper or 75th empirical quartile, as 75% of the data lies below this point. Here, the quartiles are provided as an ordered list of approximate quartile values for the scanned data, in the order Q1, median, Q3. |
| max | Output only. Maximum of non-null values in the scanned data. NaN, if the field has a NaN. |
DoubleFieldInfo
The profile information for a double type field.
| JSON representation |
|---|
{ "average" : number , "standardDeviation" : number , "min" : number , "quartiles" : [ number ] , "max" : number } |
| Fields | |
|---|---|
| average | Output only. Average of non-null values in the scanned data. NaN, if the field has a NaN. |
| standardDeviation | Output only. Standard deviation of non-null values in the scanned data. NaN, if the field has a NaN. |
| min | Output only. Minimum of non-null values in the scanned data. NaN, if the field has a NaN. |
| quartiles[] | Output only. A quartile divides the number of data points into four parts, or quarters, of more-or-less equal size. The three main quartiles are: the first quartile (Q1), which splits off the lowest 25% of data from the highest 75% and is also known as the lower or 25th empirical quartile, as 25% of the data is below this point; the second quartile (Q2), which is the median of the data set, so 50% of the data lies below this point; and the third quartile (Q3), which splits off the highest 25% of data from the lowest 75% and is known as the upper or 75th empirical quartile, as 75% of the data lies below this point. Here, the quartiles are provided as an ordered list of quartile values for the scanned data, in the order Q1, median, Q3. |
| max | Output only. Maximum of non-null values in the scanned data. NaN, if the field has a NaN. |
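
The Q1, median, Q3 ordering described for quartiles[] can be reproduced on a small sample with Python's standard library. This is a sketch of the definition, not of the service's method, which this reference does not describe. `statistics.quantiles` uses its default "exclusive" interpolation here, so real scan output may differ slightly, and `quartiles` is a hypothetical helper.

```python
import statistics


def quartiles(values):
    """Return [Q1, median, Q3] for the non-null values, or [] if too few."""
    data = sorted(v for v in values if v is not None)
    if len(data) < 2:
        return []  # Too few points to compute quartiles in this sketch.
    # n=4 yields the three cut points that split the data into quarters.
    return statistics.quantiles(data, n=4)


sample_quartiles = quartiles([7, 1, None, 3, 5, 2, 6, 4])
print(sample_quartiles)  # [2.0, 4.0, 6.0]
```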
PostScanActionsResult
The result of the post-scan actions of a DataProfileScan job.
| JSON representation |
|---|
| { "bigqueryExportResult" : { object (BigQueryExportResult) } } |
| Fields | |
|---|---|
| bigqueryExportResult | Output only. The result of the BigQuery export post-scan action. |
BigQueryExportResult
The result of the BigQuery export post-scan action.
| JSON representation |
|---|
| { "state" : enum (State) } |
| Fields | |
|---|---|
| state | Output only. Execution state of the BigQuery export. |
State
Execution state of the export.
| Enums | |
|---|---|
| STATE_UNSPECIFIED | The export state is unspecified. |
| SUCCEEDED | The export completed successfully. |
| FAILED | The export is no longer running due to an error. |
| SKIPPED | The export was skipped because there was no valid scan result to export (usually because the scan failed). |
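
A client that inspects the export outcome might map the state string onto an enumeration like the one below. This is a sketch: it assumes enum values appear in JSON as their names, and both the `State` class and the `export_state` helper are hypothetical, not part of a client library.

```python
from enum import Enum


class State(Enum):
    """Mirror of the State enum values listed in this reference."""
    STATE_UNSPECIFIED = "STATE_UNSPECIFIED"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    SKIPPED = "SKIPPED"


def export_state(result: dict) -> State:
    """Read the BigQuery export state from a DataProfileResult-like dict."""
    export = result.get("postScanActionsResult", {}).get("bigqueryExportResult", {})
    # Assumption: a missing state is treated as STATE_UNSPECIFIED.
    return State(export.get("state", "STATE_UNSPECIFIED"))


# Hypothetical result fragment, hand-written for illustration.
sample_result = {"postScanActionsResult": {"bigqueryExportResult": {"state": "SKIPPED"}}}
state = export_state(sample_result)
print(state.name)
```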

