GcsSource

Cloud Storage location for input content.

JSON representation
{
  "inputUris": [
    string
  ],
  "dataSchema": string
}
Fields
inputUris[]

string

Required. Cloud Storage URIs of the input files. Each URI can be up to 2000 characters long. A URI can either match a full object path (for example, gs://bucket/directory/object.json) or be a pattern that matches one or more files (for example, gs://bucket/directory/*.json).

A request can contain at most 100 files (or 100,000 files if dataSchema is content). Each file can be up to 2 GB (or 100 MB if dataSchema is content).
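
The constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper names build_gcs_source and max_files_for are hypothetical, the gs:// prefix check is an assumption based on the example URIs, and the 2000-character limit is approximated with Python's len(). The set of accepted dataSchema values is taken from the dataSchema field description on this page.

```python
from typing import Dict, List, Optional, Union

MAX_URI_LENGTH = 2000  # "Each URI can be up to 2000 characters long."
KNOWN_SCHEMAS = {"document", "content", "custom", "csv", "user_event"}


def max_files_for(data_schema: Optional[str]) -> int:
    """Return the per-request file limit stated in the field description."""
    return 100_000 if data_schema == "content" else 100


def build_gcs_source(
    input_uris: List[str], data_schema: Optional[str] = None
) -> Dict[str, Union[List[str], str]]:
    """Build a GcsSource JSON body as a dict (hypothetical helper)."""
    if not input_uris:
        raise ValueError("inputUris is required and must not be empty")
    for uri in input_uris:
        # Assumption: every URI uses the gs:// scheme, as in the examples.
        if not uri.startswith("gs://"):
            raise ValueError(f"not a Cloud Storage URI: {uri!r}")
        if len(uri) > MAX_URI_LENGTH:
            raise ValueError(f"URI longer than {MAX_URI_LENGTH} characters")
    if data_schema is not None and data_schema not in KNOWN_SCHEMAS:
        raise ValueError(f"unsupported dataSchema: {data_schema!r}")

    source: Dict[str, Union[List[str], str]] = {"inputUris": list(input_uris)}
    # Omit dataSchema so the documented default applies.
    if data_schema is not None:
        source["dataSchema"] = data_schema
    return source
```

A pattern such as gs://bucket/directory/*.json may expand to many objects, so the file-count limit returned by max_files_for can only be verified after the matching objects have been listed; this sketch does not do that.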

dataSchema

string

The schema to use when parsing the data from the source.

Supported values for document imports:

  • document (default): One JSON Document per line. Each document must have a valid Document.id.
  • content: Unstructured data (for example, PDF or HTML). Each file matched by inputUris becomes a document, with the ID set to the first 128 bits of SHA256(URI) encoded as a hex string.
  • custom: One custom data JSON per row in arbitrary format that conforms to the defined Schema of the data store. This can only be used by the GENERIC Data Store vertical.
  • csv: A CSV file with header conforming to the defined Schema of the data store. Each entry after the header is imported as a Document. This can only be used by the GENERIC Data Store vertical.
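
For the content schema, the document ID rule above (first 128 bits of SHA256(URI), hex-encoded) can be reproduced locally, which may help when looking up an imported file later. This is a sketch under assumptions the page does not state: that the hashed string is the full gs:// URI of the matched object, that it is encoded as UTF-8, and that the hex digits are lowercase. The function name content_document_id is hypothetical.

```python
import hashlib


def content_document_id(uri: str) -> str:
    """Derive a document ID from a Cloud Storage URI (hypothetical helper).

    Takes the first 128 bits (16 bytes) of the SHA-256 digest of the URI
    and returns them as a 32-character hex string.
    """
    # Assumption: the URI is hashed as UTF-8 bytes.
    digest = hashlib.sha256(uri.encode("utf-8")).digest()
    # 128 bits = 16 bytes = 32 hex characters (lowercase is an assumption).
    return digest[:16].hex()
```

If IDs computed this way do not match those in the data store, one of the assumptions above (encoding, letter case, or the exact URI string) is the likely cause.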

Supported values for user event imports:

  • user_event (default): One JSON UserEvent per line.