Cloud Storage location for input content.
JSON representation |
---|
{ "inputUris" : [ string ] , "dataSchema" : string } |
inputUris[]
string
Required. Cloud Storage URIs to input files. Each URI can be up to 2000 characters long. URIs can match the full object path (for example, gs://bucket/directory/object.json) or a pattern matching one or more files, such as gs://bucket/directory/*.json.
A request can contain at most 100 files (or 100,000 files if dataSchema is content). Each file can be up to 2 GB (or 100 MB if dataSchema is content).
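As an illustration of the shape described above, the following is a minimal sketch that assembles a GcsSource-style JSON body and applies the checks that can be done client-side. The helper name build_gcs_source and the gs:// prefix check are assumptions made for this example and are not part of the API. The file-count and file-size limits are not checked here, because a pattern URI can match any number of objects and only the service can resolve it.

```python
import json

# Limit taken from the field description above.
MAX_URI_LENGTH = 2000


def build_gcs_source(input_uris, data_schema=None):
    """Return a dict shaped like the GcsSource JSON representation.

    Hypothetical helper: it only checks the URI prefix and URI length.
    """
    if not input_uris:
        raise ValueError("inputUris is required and must not be empty")
    for uri in input_uris:
        if not uri.startswith("gs://"):
            raise ValueError(f"not a Cloud Storage URI: {uri!r}")
        if len(uri) > MAX_URI_LENGTH:
            raise ValueError(f"URI exceeds {MAX_URI_LENGTH} characters: {uri[:40]!r}...")
    source = {"inputUris": list(input_uris)}
    # Omit dataSchema when unset so the service applies its default.
    if data_schema is not None:
        source["dataSchema"] = data_schema
    return source


if __name__ == "__main__":
    body = build_gcs_source(["gs://bucket/directory/*.json"], data_schema="content")
    print(json.dumps(body, indent=2))
```

Leaving dataSchema out of the body when it is not set keeps the request minimal and lets the documented default apply.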
dataSchema
string
The schema to use when parsing the data from the source.
Supported values for document imports:
- document (default): One JSON Document per line. Each document must have a valid Document.id.
- content: Unstructured data (e.g. PDF, HTML). Each file matched by inputUris becomes a document, with the ID set to the first 128 bits of SHA256(URI) encoded as a hex string.
- custom: One custom data JSON per row in arbitrary format that conforms to the defined Schema of the data store. This can only be used by the GENERIC Data Store vertical.
- csv: A CSV file with header conforming to the defined Schema of the data store. Each entry after the header is imported as a Document. This can only be used by the GENERIC Data Store vertical.
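The ID rule for the content schema can be sketched as follows. This is an illustration under stated assumptions, not the service's implementation: it assumes the URI is hashed as its UTF-8 bytes and that the hex string is lowercase, neither of which is stated above. The function name content_document_id is hypothetical.

```python
import hashlib


def content_document_id(uri: str) -> str:
    """Return the first 128 bits of SHA-256(uri) as a hex string.

    Assumption: the URI is hashed as UTF-8 bytes and rendered in lowercase hex.
    """
    digest = hashlib.sha256(uri.encode("utf-8")).digest()
    # 128 bits = 16 bytes = 32 hex characters.
    return digest[:16].hex()


if __name__ == "__main__":
    print(content_document_id("gs://bucket/directory/object.pdf"))
```

If the assumptions hold, the result is a 32-character string that is stable for a given URI, so importing the same object again yields the same document ID.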
Supported values for user event imports:
- user_event (default): One JSON UserEvent per line.
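The "one JSON object per line" layout used by the document and user_event schemas can be produced with a small serializer such as the sketch below. The helper name to_json_lines is hypothetical, and the field names in the sample records (eventType, userPseudoId) are illustrative placeholders only; the UserEvent reference is the authority on which fields are required.

```python
import json


def to_json_lines(records):
    """Serialize each record as compact JSON on its own line."""
    # json.dumps escapes newlines inside strings, so each record stays on one line.
    return "".join(json.dumps(r, separators=(",", ":")) + "\n" for r in records)


if __name__ == "__main__":
    # Placeholder records; the field names are assumptions for illustration.
    events = [
        {"eventType": "search", "userPseudoId": "visitor-1"},
        {"eventType": "view-item", "userPseudoId": "visitor-2"},
    ]
    print(to_json_lines(events), end="")
```

The resulting text can be written to a file and uploaded to a Cloud Storage path that is then listed in inputUris.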