A singleton resource under a Processor that configures a collection of documents.
JSON representation |
---|
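{ "name" : string , "state" : enum (State) , "satisfiesPzs" : boolean , "satisfiesPzi" : boolean , "gcsManagedConfig" : { object (GCSManagedConfig) } , "documentWarehouseConfig" : { object (DocumentWarehouseConfig) } , "unmanagedDatasetConfig" : { object (UnmanagedDatasetConfig) } , "spannerIndexingConfig" : { object (SpannerIndexingConfig) } } |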
Fields | |
---|---|
name | string. Dataset resource name. Format: projects/{project}/locations/{location}/processors/{processor}/dataset |
state | enum (State). Required. State of the dataset. Ignored when updating dataset. |
satisfiesPzs | boolean. Output only. Reserved for future use. |
satisfiesPzi | boolean. Output only. Reserved for future use. |
Union field storage_source. storage_source can be only one of the following: | |
gcsManagedConfig | object (GCSManagedConfig). Optional. User-managed Cloud Storage dataset configuration. Use this configuration if the dataset documents are stored under a user-managed Cloud Storage location. |
documentWarehouseConfig (deprecated) | object (DocumentWarehouseConfig). Optional. Deprecated. Warehouse-based dataset configuration is not supported. |
unmanagedDatasetConfig | object (UnmanagedDatasetConfig). Optional. Unmanaged dataset configuration. Use this configuration if the dataset documents are managed by the document service internally (not user-managed). |
Union field indexing_source. indexing_source can be only one of the following: | |
spannerIndexingConfig | object (SpannerIndexingConfig). Optional. A lightweight indexing source with low latency and high reliability, but lacking advanced features like CMEK and content-based search. |
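
The name format and the two union fields above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official client library: the helper names (`dataset_name`, `parse_dataset_name`, `check_unions`) are hypothetical, and the rule that an ID segment is any run of characters without a slash is an assumption, since the allowed characters are not specified here.

```python
import re

# Pattern derived from the documented format string. The `[^/]+` segments
# are an assumption; the reference does not state which characters are allowed.
_DATASET_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/processors/(?P<processor>[^/]+)/dataset$"
)

# Members of each union field, as listed in the field table.
UNIONS = {
    "storage_source": (
        "gcsManagedConfig",
        "documentWarehouseConfig",
        "unmanagedDatasetConfig",
    ),
    "indexing_source": ("spannerIndexingConfig",),
}


def dataset_name(project: str, location: str, processor: str) -> str:
    """Build the singleton Dataset resource name for a processor."""
    return (
        f"projects/{project}/locations/{location}"
        f"/processors/{processor}/dataset"
    )


def parse_dataset_name(name: str) -> dict:
    """Split a Dataset resource name into its project, location, and processor."""
    match = _DATASET_NAME.match(name)
    if match is None:
        raise ValueError(f"not a Dataset resource name: {name!r}")
    return match.groupdict()


def check_unions(body: dict) -> None:
    """Raise ValueError if more than one member of a union field is set."""
    for union, members in UNIONS.items():
        present = [m for m in members if m in body]
        if len(present) > 1:
            raise ValueError(
                f"{union}: only one of {members} may be set, got {present}"
            )
```

A body that sets both `gcsManagedConfig` and `unmanagedDatasetConfig` would be rejected by `check_unions`, while one member from each union passes.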
GCSManagedConfig
Configuration specific to the Cloud Storage-based implementation.
JSON representation |
---|
{ "gcsPrefix" : { object (GcsPrefix) } } |
Fields | |
---|---|
gcsPrefix | object (GcsPrefix). Required. The Cloud Storage URI (a directory) where the documents belonging to the dataset must be stored. |
GcsPrefix
Specifies all documents on Cloud Storage with a common prefix.
JSON representation |
---|
{ "gcsUriPrefix" : string } |
Fields | |
---|---|
gcsUriPrefix | string. The URI prefix. |
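
To show how GCSManagedConfig and GcsPrefix nest, here is a minimal sketch that builds only the storage_source part of a Dataset body. The function name `gcs_managed_config` and the bucket path are hypothetical, and the `gs://` check is a local sanity check, not documented server-side validation.

```python
import json


def gcs_managed_config(gcs_uri_prefix: str) -> dict:
    """Return the gcsManagedConfig fragment of a Dataset body.

    The nesting mirrors the JSON representations above:
    gcsManagedConfig -> gcsPrefix -> gcsUriPrefix.
    """
    # Cloud Storage URIs use the gs:// scheme; rejecting anything else is a
    # client-side convenience assumed for this sketch.
    if not gcs_uri_prefix.startswith("gs://"):
        raise ValueError(f"expected a gs:// URI, got {gcs_uri_prefix!r}")
    return {"gcsManagedConfig": {"gcsPrefix": {"gcsUriPrefix": gcs_uri_prefix}}}


# Hypothetical bucket and directory, for illustration only.
fragment = gcs_managed_config("gs://example-bucket/datasets/invoices/")
print(json.dumps(fragment, indent=2))
```

The fragment deliberately omits `state` and the other Dataset fields; it only illustrates the shape of one storage_source choice.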
DocumentWarehouseConfig
Configuration specific to the Document AI Warehouse-based implementation.
JSON representation |
---|
{ "collection" : string , "schema" : string } |
Fields | |
---|---|
collection | string. Output only. The collection in Document AI Warehouse associated with the dataset. |
schema | string. Output only. The schema in Document AI Warehouse associated with the dataset. |
UnmanagedDatasetConfig
Configuration specific to an unmanaged dataset.
This type has no fields.
SpannerIndexingConfig
Configuration specific to Spanner-based indexing.
This type has no fields.
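
Because UnmanagedDatasetConfig and SpannerIndexingConfig have no fields, each appears in JSON as an empty object, and the presence of the key is what selects that union member. The sketch below illustrates this; the function name `unmanaged_spanner_fragment` is hypothetical, and pairing these two members is shown only because they belong to different union fields, not as a recommended configuration.

```python
import json


def unmanaged_spanner_fragment() -> dict:
    """Return a Dataset fragment that selects both field-less config types."""
    return {
        # storage_source: a message with no fields serializes as {}.
        "unmanagedDatasetConfig": {},
        # indexing_source: likewise an empty object.
        "spannerIndexingConfig": {},
    }


print(json.dumps(unmanaged_spanner_fragment(), indent=2))
```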