- Resource: Entity
- Type
- StorageSystem
- StorageFormat
- Format
- CompressionFormat
- CsvOptions
- JsonOptions
- IcebergOptions
- CompatibilityStatus
- Compatibility
- StorageAccess
- AccessMode
- Schema
- SchemaField
- Type
- Mode
- PartitionField
- PartitionStyle
- Methods
Resource: Entity
Represents tables and fileset metadata contained within a zone.
JSON representation |
---|
{ "name" : string , "displayName" : string , "description" : string , "createTime" : string , "updateTime" : string , "id" : string , "etag" : string , "type" : enum ( |
Fields | |
---|---|
name
|
Output only. The resource name of the entity, of the form: |
displayName
|
Optional. Display name must be shorter than or equal to 256 characters. |
description
|
Optional. User friendly longer description text. Must be shorter than or equal to 1024 characters. |
createTime
|
Output only. The time when the entity was created. Uses RFC 3339, where generated output will always be Z-normalized and uses 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: |
updateTime
|
Output only. The time when the entity was last updated. Uses RFC 3339, where generated output will always be Z-normalized and uses 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: |
id
|
Required. A user-provided entity ID. It is mutable, and will be used as the published table name. Specifying a new ID in an update entity request will override the existing value. The ID must contain only letters (a-z, A-Z), numbers (0-9), and underscores, and consist of 256 or fewer characters. |
etag
|
Optional. The etag associated with the entity, which can be retrieved with an entities.get request. Required for update and delete requests. |
type
|
Required. Immutable. The type of entity. |
asset
|
Required. Immutable. The ID of the asset associated with the storage location containing the entity data. The entity must be within the same zone as the asset. |
dataPath
|
Required. Immutable. The storage path of the entity data. For Cloud Storage data, this is the fully-qualified path to the entity, such as |
dataPathPattern
|
Optional. The set of items within the data path constituting the data in the entity, represented as a glob path. Example: |
catalogEntry
|
Output only. The name of the associated Data Catalog entry. |
system
|
Required. Immutable. Identifies the storage system of the entity data. |
format
|
Required. Identifies the storage format of the entity data. It does not apply to entities with data stored in BigQuery. |
compatibility
|
Output only. Metadata stores that the entity is compatible with. |
access
|
Output only. Identifies the access mechanism to the entity. Not user settable. |
uid
|
Output only. System generated unique ID for the Entity. This ID will be different if the Entity is deleted and re-created with the same name. |
schema
|
Required. The description of the data structure and layout. The schema is not included in list responses. It is only included in |
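The length and character constraints on `id`, `displayName`, and `description` can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function name `validate_entity_fields` is hypothetical, and rejecting an empty `id` is an assumption based on the field being marked Required.

```python
import re

# Letters, digits, and underscores only, at most 256 characters.
# Assumption: an empty id is rejected because the field is Required.
_ENTITY_ID_RE = re.compile(r"[A-Za-z0-9_]{1,256}")


def validate_entity_fields(entity: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not _ENTITY_ID_RE.fullmatch(entity.get("id", "")):
        problems.append("id must be 1-256 letters, digits, or underscores")
    if len(entity.get("displayName", "")) > 256:
        problems.append("displayName must be at most 256 characters")
    if len(entity.get("description", "")) > 1024:
        problems.append("description must be at most 1024 characters")
    return problems
```

The service remains the authority on validity; a local check like this only catches the documented limits early.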
Type
The type of entity.
Enums | |
---|---|
TYPE_UNSPECIFIED
|
Type unspecified. |
TABLE
|
Structured and semi-structured data. |
FILESET
|
Unstructured data. |
StorageSystem
Identifies the cloud system that manages the data storage.
Enums | |
---|---|
STORAGE_SYSTEM_UNSPECIFIED
|
Storage system unspecified. |
CLOUD_STORAGE
|
The entity data is contained within a Cloud Storage bucket. |
BIGQUERY
|
The entity data is contained within a BigQuery dataset. |
StorageFormat
Describes the format of the data within its storage location.
JSON representation |
---|
{ "format" : enum ( |
format
enum (
Format
)
Output only. The data format associated with the stored data, which represents content type values. The value is inferred from mime type.
compressionFormat
enum (
CompressionFormat
)
Optional. The compression type associated with the stored data. If unspecified, the data is uncompressed.
mimeType
string
Required. The mime type descriptor for the data. Must match the pattern {type}/{subtype}. Supported values:
- application/x-parquet
- application/x-avro
- application/x-orc
- application/x-tfrecord
- application/x-parquet+iceberg
- application/x-avro+iceberg
- application/x-orc+iceberg
- application/json
- application/{subtypes}
- text/csv
- text/
- image/{image subtype}
- video/{video subtype}
- audio/{audio subtype}
csv
object (
CsvOptions
)
Optional. Additional information about CSV formatted data.
json
object (
JsonOptions
)
Optional. Additional information about JSON formatted data.
iceberg
object (
IcebergOptions
)
Optional. Additional information about Iceberg tables.
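The `{type}/{subtype}` requirement on `mimeType` can be sketched as a simple shape check. The helper names below are hypothetical, and the regular expression is an assumption about what counts as well formed; the service may apply stricter rules and only accepts the supported values listed above.

```python
import re

# Assumption: type and subtype are non-empty and contain no "/" or whitespace.
_MIME_RE = re.compile(r"[^/\s]+/[^/\s]+")


def is_well_formed_mime_type(mime_type: str) -> bool:
    """Check only the {type}/{subtype} shape, not membership in the supported list."""
    return _MIME_RE.fullmatch(mime_type) is not None


def uses_iceberg(mime_type: str) -> bool:
    """True for the listed Iceberg variants, which all end in "+iceberg"."""
    return mime_type.endswith("+iceberg")
```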
Format
The specific file format of the data.
Enums | |
---|---|
FORMAT_UNSPECIFIED
|
Format unspecified. |
PARQUET
|
Parquet-formatted structured data. |
AVRO
|
Avro-formatted structured data. |
ORC
|
ORC-formatted structured data. |
CSV
|
CSV-formatted semi-structured data. |
JSON
|
JSON-formatted semi-structured data. |
IMAGE
|
Image data formats (such as jpg and png). |
AUDIO
|
Audio data formats (such as mp3 and wav). |
VIDEO
|
Video data formats (such as mp4 and mpg). |
TEXT
|
Textual data formats (such as txt and xml). |
TFRECORD
|
TensorFlow record format. |
OTHER
|
Data that doesn't match a specific format. |
UNKNOWN
|
Data of an unknown format. |
CompressionFormat
The specific compressed file format of the data.
Enums | |
---|---|
COMPRESSION_FORMAT_UNSPECIFIED
|
CompressionFormat unspecified. Implies uncompressed data. |
GZIP
|
GZip compressed set of files. |
BZIP2
|
BZip2 compressed set of files. |
CsvOptions
Describes CSV and similar semi-structured data formats.
JSON representation |
---|
{ "encoding" : string , "headerRows" : integer , "delimiter" : string , "quote" : string } |
Fields | |
---|---|
encoding
|
Optional. The character encoding of the data. Accepts "US-ASCII", "UTF-8", and "ISO-8859-1". Defaults to UTF-8 if unspecified. |
headerRows
|
Optional. The number of rows to interpret as header rows that should be skipped when reading data rows. Defaults to 0. |
delimiter
|
Optional. The delimiter used to separate values. Defaults to ','. |
quote
|
Optional. The character used to quote column values. Accepts '"' (double quotation mark) or ''' (single quotation mark). Defaults to '"' (double quotation mark) if unspecified. |
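To illustrate how these options interact, the sketch below applies a `CsvOptions`-shaped dictionary when reading CSV bytes with Python's standard `csv` module. The function `read_csv_rows` is hypothetical and does not reflect how the service parses data; it only mirrors the documented defaults (UTF-8, 0 header rows, `,` delimiter, `"` quote).

```python
import csv
import io


def read_csv_rows(raw: bytes, options: dict) -> list:
    """Decode CSV bytes and return the data rows, skipping header rows."""
    # Defaults follow the field descriptions above.
    encoding = options.get("encoding", "UTF-8")
    header_rows = options.get("headerRows", 0)
    delimiter = options.get("delimiter", ",")
    quote = options.get("quote", '"')

    text = raw.decode(encoding)
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=quote)
    rows = list(reader)
    # headerRows counts the leading rows that are not data.
    return rows[header_rows:]
```

For example, `read_csv_rows(data, {"headerRows": 1, "delimiter": ";", "quote": "'"})` would skip one header row and treat `'a;b'` as a single value.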
JsonOptions
Describes JSON data format.
JSON representation |
---|
{ "encoding" : string } |
Fields | |
---|---|
encoding
|
Optional. The character encoding of the data. Accepts "US-ASCII", "UTF-8" and "ISO-8859-1". Defaults to UTF-8 if not specified. |
IcebergOptions
Describes Iceberg data format.
JSON representation |
---|
{ "metadataLocation" : string } |
Fields | |
---|---|
metadataLocation
|
Optional. The location of the Iceberg metadata. Must be within the table path. |
CompatibilityStatus
Provides compatibility information for various metadata stores.
JSON representation |
---|
{ "hiveMetastore" : { object ( |
Fields | |
---|---|
hiveMetastore
|
Output only. Whether this entity is compatible with Hive Metastore. |
bigquery
|
Output only. Whether this entity is compatible with BigQuery. |
Compatibility
Provides compatibility information for a specific metadata store.
JSON representation |
---|
{ "compatible" : boolean , "reason" : string } |
Fields | |
---|---|
compatible
|
Output only. Whether the entity is compatible and can be represented in the metadata store. |
reason
|
Output only. Provides additional detail if the entity is incompatible with the metadata store. |
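A client reading an entity might summarize which metadata stores reject it. The sketch below assumes a `CompatibilityStatus`-shaped dictionary as described above; the function name `incompatible_stores` is hypothetical, and treating a missing `compatible` value as false is an assumption.

```python
def incompatible_stores(status: dict) -> dict:
    """Map each metadata store reporting compatible=false to its reason."""
    result = {}
    for store in ("hiveMetastore", "bigquery"):
        info = status.get(store)
        # Assumption: an absent "compatible" key is treated as false.
        if info is not None and not info.get("compatible", False):
            result[store] = info.get("reason", "")
    return result
```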
StorageAccess
Describes the access mechanism of the data within its storage location.
JSON representation |
---|
{ "read" : enum ( |
Fields | |
---|---|
read
|
Output only. Describes the read access mechanism of the data. Not user settable. |
AccessMode
Access Mode determines how data stored within the Entity is read.
Enums | |
---|---|
ACCESS_MODE_UNSPECIFIED
|
Access mode unspecified. |
DIRECT
|
Default. Data is accessed directly using storage APIs. |
MANAGED
|
Data is accessed through a managed interface using BigQuery APIs. |
Schema
Schema information describing the structure and layout of the data.
JSON representation |
---|
{ "userManaged" : boolean , "fields" : [ { object ( |
userManaged
boolean
Required. Set to true if user-managed or false if managed by Dataplex Universal Catalog. The default is false (managed by Dataplex Universal Catalog).
- Set to false to enable Dataplex Universal Catalog discovery to update the schema, including new data discovery, schema inference, and schema evolution. Users retain the ability to input and edit the schema. Dataplex Universal Catalog treats schema input by the user as though produced by a previous Dataplex Universal Catalog discovery operation, and it will evolve the schema and take action based on that treatment.
- Set to true to fully manage the entity schema. This setting guarantees that Dataplex Universal Catalog will not change schema fields.
fields[]
object (
SchemaField
)
Optional. The sequence of fields describing data in table entities. Note: BigQuery SchemaFields are immutable.
partitionFields[]
object (
PartitionField
)
Optional. The sequence of fields describing the partition structure in entities. If this field is empty, there are no partitions within the data.
partitionStyle
enum (
PartitionStyle
)
Optional. The structure of paths containing partition data within the entity.
SchemaField
Represents a column field within a table schema.
JSON representation |
---|
{ "name" : string , "description" : string , "type" : enum ( |
Fields | |
---|---|
name
|
Required. The name of the field. Must contain only letters, numbers and underscores, with a maximum length of 767 characters, and must begin with a letter or underscore. |
description
|
Optional. User friendly field description. Must be less than or equal to 1024 characters. |
type
|
Required. The type of field. |
mode
|
Required. Additional field semantics. |
fields[]
|
Optional. Any nested field for complex types. |
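Because `fields[]` can nest, a name check has to walk the tree. The following sketch collects the dotted paths of fields whose names break the documented rule (letters, numbers, and underscores; at most 767 characters; starting with a letter or underscore). The function name `invalid_field_names` is hypothetical, and reading "letters" as ASCII letters is an assumption.

```python
import re

# Assumption: "letters" means ASCII letters.
_FIELD_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def invalid_field_names(fields: list, prefix: str = "") -> list:
    """Recursively collect dotted paths of fields with invalid names."""
    bad = []
    for field in fields:
        name = field.get("name", "")
        path = prefix + name
        if len(name) > 767 or not _FIELD_NAME_RE.fullmatch(name):
            bad.append(path)
        # RECORD fields may carry nested fields of their own.
        bad.extend(invalid_field_names(field.get("fields", []), path + "."))
    return bad
```

Applied to a schema whose `customer` record contains a nested field named `1st_name`, this would report `customer.1st_name`.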
Type
Type information for fields in schemas and partition schemas.
Enums | |
---|---|
TYPE_UNSPECIFIED
|
SchemaType unspecified. |
BOOLEAN
|
Boolean field. |
BYTE
|
Single byte numeric field. |
INT16
|
16-bit numeric field. |
INT32
|
32-bit numeric field. |
INT64
|
64-bit numeric field. |
FLOAT
|
Floating point numeric field. |
DOUBLE
|
Double precision numeric field. |
DECIMAL
|
Real value numeric field. |
STRING
|
Sequence of characters field. |
BINARY
|
Sequence of bytes field. |
TIMESTAMP
|
Date and time field. |
DATE
|
Date field. |
TIME
|
Time field. |
RECORD
|
Structured field. Nested fields that define the structure of the map. If all nested fields are nullable, this field represents a union. |
NULL
|
Null field that does not have values. |
Mode
Additional qualifiers to define field semantics.
Enums | |
---|---|
MODE_UNSPECIFIED
|
Mode unspecified. |
REQUIRED
|
The field has required semantics. |
NULLABLE
|
The field has optional semantics, and may be null. |
REPEATED
|
The field has repeated (0 or more) semantics, and is a list of values. |
PartitionField
Represents a key field within the entity's partition structure. You can have up to 20 partition fields, but only the first 10 can be used for filtering, for performance reasons. Note: Partition fields are immutable.
JSON representation |
---|
{ "name" : string , "type" : enum ( |
Fields | |
---|---|
name
|
Required. The partition field name must consist of letters, numbers, and underscores only, with a maximum length of 256 characters, and must begin with a letter or underscore. |
type
|
Required. Immutable. The type of field. |
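The limits on partition fields (at most 20, with only the first 10 usable for filtering) can be made explicit in client code. This is a sketch under those stated limits; `split_partition_fields` and the constant names are hypothetical.

```python
MAX_PARTITION_FIELDS = 20   # documented upper bound on partition fields
MAX_FILTERABLE_FIELDS = 10  # only the first 10 support filtering


def split_partition_fields(partition_fields: list) -> tuple:
    """Return (filterable_names, non_filterable_names) in declaration order."""
    if len(partition_fields) > MAX_PARTITION_FIELDS:
        raise ValueError("an entity can have at most 20 partition fields")
    names = [field["name"] for field in partition_fields]
    return names[:MAX_FILTERABLE_FIELDS], names[MAX_FILTERABLE_FIELDS:]
```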
PartitionStyle
The structure of paths within the entity, which represent partitions.
Enums | |
---|---|
PARTITION_STYLE_UNSPECIFIED
|
PartitionStyle unspecified. |
HIVE_COMPATIBLE
|
Partitions are hive-compatible. Examples: gs://bucket/path/to/table/dt=2019-10-31/lang=en, gs://bucket/path/to/table/dt=2019-10-31/lang=en/late. |
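Hive-compatible paths encode partition values as `key=value` directory segments below the table root. The sketch below extracts those pairs from the example paths; `parse_hive_partitions` is a hypothetical helper, and ignoring segments without `=` (such as the trailing `late`) is an assumption about how such paths should be read.

```python
def parse_hive_partitions(path: str, table_root: str) -> dict:
    """Extract key=value partition segments that follow table_root in path."""
    if not path.startswith(table_root):
        raise ValueError("path is not under the table root")
    relative = path[len(table_root):].strip("/")
    partitions = {}
    for segment in relative.split("/"):
        key, sep, value = segment.partition("=")
        # Assumption: segments without "=" are not partition keys.
        if sep and key:
            partitions[key] = value
    return partitions
```

For `gs://bucket/path/to/table/dt=2019-10-31/lang=en` with root `gs://bucket/path/to/table`, this yields `dt` mapped to `2019-10-31` and `lang` mapped to `en`.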
Methods |
|
---|---|
|
Create a metadata entity. |
|
Delete a metadata entity. |
|
Get a metadata entity. |
|
List metadata entities in a zone. |
|
Update a metadata entity. |