Required. Cloud Storage URIs to input files. Each URI can be up to 2000 characters long. A URI can match the full object path (for example, gs://bucket/directory/object.json) or be a pattern matching one or more files, such as gs://bucket/directory/*.json.
A request can contain at most 100 files (or 100,000 files if dataSchema is content). Each file can be up to 2 GB (or 100 MB if dataSchema is content).
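The limits above can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: the helper names build_gcs_source and max_files are hypothetical, and the gs:// prefix check is an assumption drawn from the example URIs. Because a wildcard pattern can match many files, counting URIs only gives a lower bound on the number of files, and file sizes cannot be checked without listing the bucket.

```python
MAX_URI_LENGTH = 2000  # per-URI character limit stated in the field description


def max_files(data_schema: str = "document") -> int:
    """Return the per-request file limit for the given dataSchema value."""
    return 100_000 if data_schema == "content" else 100


def build_gcs_source(input_uris, data_schema="document"):
    """Validate the inputs and return a dict shaped like the JSON representation.

    Hypothetical helper: only the limits that are visible client-side are checked.
    """
    if not input_uris:
        raise ValueError("inputUris is required")
    for uri in input_uris:
        # Assumption: every input URI uses the gs:// scheme, as in the examples.
        if not uri.startswith("gs://"):
            raise ValueError(f"not a Cloud Storage URI: {uri}")
        if len(uri) > MAX_URI_LENGTH:
            raise ValueError(f"URI exceeds {MAX_URI_LENGTH} characters")
    # Each URI matches at least one file, so this is a lower-bound check only.
    if len(input_uris) > max_files(data_schema):
        raise ValueError("too many input files for this dataSchema")
    return {"inputUris": list(input_uris), "dataSchema": data_schema}
```

For example, build_gcs_source(["gs://bucket/directory/*.json"]) returns a dict with the inputUris and dataSchema keys, ready to be serialized as JSON.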
dataSchema
string
The schema to use when parsing the data from the source.
Supported values for document imports:
document (default): One JSON Document per line. Each document must have a valid Document.id.
content: Unstructured data (e.g. PDF, HTML). Each file matched by inputUris becomes a document, with the ID set to the first 128 bits of SHA256(URI) encoded as a hex string.
custom: One custom data JSON per row in arbitrary format that conforms to the defined Schema of the data store. This can only be used by the GENERIC Data Store vertical.
csv: A CSV file with header conforming to the defined Schema of the data store. Each entry after the header is imported as a Document. This can only be used by the GENERIC Data Store vertical.
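The ID rule for the content schema (the first 128 bits of SHA256(URI) as a hex string) can be reproduced locally, which may help when mapping imported documents back to their source files. This is a sketch under assumptions: the function name content_document_id is hypothetical, and the source does not state the byte encoding of the URI or the hex letter case, so UTF-8 and lowercase hex are assumed here.

```python
import hashlib


def content_document_id(uri: str) -> str:
    """Return the first 128 bits of SHA-256(uri) as a 32-character hex string.

    Assumptions: the URI is hashed as UTF-8 bytes and the hex output is
    lowercase; neither detail is confirmed by the field description.
    """
    digest = hashlib.sha256(uri.encode("utf-8")).digest()
    return digest[:16].hex()  # 16 bytes = 128 bits
```

Calling content_document_id("gs://bucket/directory/object.pdf") yields a 32-character string; compare it against an ID from an actual import before relying on the assumed encoding.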
Last updated 2025-06-27 UTC.