Required. Cloud Storage URIs to input files. Each URI can be up to 2000 characters long. URIs can match the full object path (for example, gs://bucket/directory/object.json) or a pattern matching one or more files, such as gs://bucket/directory/*.json.
A request can contain at most 100 files (or 100,000 files if dataSchema is content). Each file can be up to 2 GB (or 100 MB if dataSchema is content).
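The limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper names build_gcs_source and max_files are hypothetical, and the gs:// prefix check is an assumption based on the example URIs. Because a pattern URI can match many files, the file-count limit can only be enforced once the matched files are known, so the sketch only exposes the limit.

```python
MAX_URI_LENGTH = 2000  # per-URI character limit from the field description


def max_files(data_schema: str = "document") -> int:
    """Return the per-request file limit for the given dataSchema value."""
    return 100_000 if data_schema == "content" else 100


def build_gcs_source(input_uris, data_schema="document"):
    """Hypothetical helper: return a GcsSource-shaped dict after basic checks."""
    if not input_uris:
        raise ValueError("inputUris is required")
    for uri in input_uris:
        # Assumption: Cloud Storage URIs use the gs:// scheme, as in the examples.
        if not uri.startswith("gs://"):
            raise ValueError(f"not a Cloud Storage URI: {uri!r}")
        if len(uri) > MAX_URI_LENGTH:
            raise ValueError(f"URI is longer than {MAX_URI_LENGTH} characters")
    return {"inputUris": list(input_uris), "dataSchema": data_schema}


source = build_gcs_source(["gs://bucket/directory/*.json"])
```

The returned dict mirrors the two fields described here (inputUris and dataSchema) and could be serialized with json.dumps as part of a request body.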
dataSchema
string
The schema to use when parsing the data from the source.
Supported values for document imports:
document (default): One JSON Document per line. Each document must have a valid Document.id.
content: Unstructured data (e.g. PDF, HTML). Each file matched by inputUris becomes a document, with the ID set to the first 128 bits of SHA256(URI) encoded as a hex string.
custom: One custom data JSON object per row, in an arbitrary format that conforms to the defined Schema of the data store. This can only be used by the GENERIC Data Store vertical.
csv: A CSV file with a header conforming to the defined Schema of the data store. Each entry after the header is imported as a Document. This can only be used by the GENERIC Data Store vertical.
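For the content schema, the ID rule (first 128 bits of SHA256(URI), hex encoded) can be reproduced locally to predict the ID a file will receive. This is a sketch under two assumptions that the description does not state: that the URI is hashed as its UTF-8 bytes, and that the full gs:// URI of the matched object is the hash input. The function name content_document_id is hypothetical.

```python
import hashlib


def content_document_id(uri: str) -> str:
    """Return the first 128 bits of SHA-256(uri) as a hex string.

    Assumption: the URI is hashed as UTF-8 bytes, exactly as written.
    """
    digest = hashlib.sha256(uri.encode("utf-8")).digest()
    # 128 bits = 16 bytes, which encode to 32 hex characters.
    return digest[:16].hex()


doc_id = content_document_id("gs://bucket/directory/object.pdf")
```

The result is always 32 hexadecimal characters. If IDs computed this way do not match those assigned by the service, the encoding or hash input assumption above is the first thing to revisit.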
Last updated 2025-06-27 UTC.