- NAME
-
- gcloud dataproc batches submit spark-sql - submit a Spark SQL batch job
- SYNOPSIS
-
-
gcloud dataproc batches submit spark-sqlSQL_SCRIPT[--async] [--batch=BATCH] [--container-image=CONTAINER_IMAGE] [--deps-bucket=DEPS_BUCKET] [--history-server-cluster=HISTORY_SERVER_CLUSTER] [--jars=[JAR, …]] [--kms-key=KMS_KEY] [--labels=[KEY=VALUE, …]] [--metastore-service=METASTORE_SERVICE] [--properties=[PROPERTY=VALUE, …]] [--region=REGION] [--request-id=REQUEST_ID] [--resource-manager-tag=KEY=VALUE] [--service-account=SERVICE_ACCOUNT] [--staging-bucket=STAGING_BUCKET] [--tags=[TAGS, …]] [--ttl=TTL] [--user-workload-authentication-type=USER_WORKLOAD_AUTHENTICATION_TYPE] [--vars=[NAME=VALUE, …]] [--version=VERSION] [--network=NETWORK|--subnet=SUBNET] [GCLOUD_WIDE_FLAG …]
-
- DESCRIPTION
- Submit a Spark SQL batch job.
- EXAMPLES
- To submit a Spark SQL job running "my-sql-script.sql" and upload it to
"gs://my-bucket", run:
gcloud dataproc batches submit spark-sql my-sql-script.sql --deps-bucket = gs://my-bucket --region = us-central1 --vars = "NAME=VALUE,NAME2=VALUE2" - POSITIONAL ARGUMENTS
-
-
SQL_SCRIPT - URI of the script that contains Spark SQL queries to execute.
-
- FLAGS
-
-
--async - Return immediately without waiting for the operation in progress to complete.
-
--batch=BATCH - The ID of the batch job to submit. The ID must contain only lowercase letters (a-z), numbers (0-9) and hyphens (-). The length of the name must be between 4 and 63 characters. If this argument is not provided, a random generated UUID will be used.
-
--container-image=CONTAINER_IMAGE - Optional custom container image to use for the batch/session runtime environment. If not specified, a default container image will be used. The value should follow the container image naming format: {registry}/{repository}/{name}:{tag}, for example, gcr.io/my-project/my-image:1.2.3
-
--deps-bucket=DEPS_BUCKET - A Cloud Storage bucket to upload workload dependencies.
-
--history-server-cluster=HISTORY_SERVER_CLUSTER - Spark History Server configuration for the batch/session job. Resource name of an existing Dataproc cluster to act as a Spark History Server for the workload in the format: "projects/{project_id}/regions/{region}/clusters/{cluster_name}".
-
--jars=[JAR,…] - Comma-separated list of jar files to be provided to the classpaths.
-
--kms-key=KMS_KEY - Cloud KMS key to use for encryption.
-
--labels=[KEY=VALUE,…] - List of label KEY=VALUE pairs to add.
Keys must start with a lowercase character and contain only hyphens (
-), underscores (_), lowercase characters, and numbers. Values must contain only hyphens (-), underscores (_), lowercase characters, and numbers. - Name of a Dataproc Metastore service to be used as an external metastore in the format: "projects/{project-id}/locations/{region}/services/{service-name}".
-
--properties=[PROPERTY=VALUE,…] - Specifies configuration properties for the workload. See Dataproc Serverless for Spark documentation for the list of supported properties.
- Region resource - Dataproc region to use. Each Dataproc region constitutes an
independent resource namespace constrained to deploying instances into Compute
Engine zones inside the region. This represents a Cloud resource. (NOTE) Some
attributes are not given arguments in this group but can be set in other ways.
To set the
projectattribute:- provide the argument
--regionon the command line with a fully specified name; - set the property
dataproc/regionwith a fully specified name; - provide the argument
--projecton the command line; - set the property
core/project.
- provide the argument
-
--region=REGION - ID of the region or fully qualified identifier for the region.
To set the
regionattribute:- provide the argument
--regionon the command line; - set the property
dataproc/region.
- provide the argument
-
--request-id=REQUEST_ID - A unique ID that identifies the request. If the service receives two batch
create requests with the same request_id, the second request is ignored and the
operation that corresponds to the first batch created and stored in the backend
is returned. Recommendation: Always set this value to a UUID. The value must
contain only letters (a-z, A-Z), numbers (0-9), underscores (
), and hyphens (-). The maximum length is 40 characters. -
--resource-manager-tag=KEY=VALUE - Resource Manager Tags to be associated with the compute resources created for the workload. Only one key-value pair can be specified per flag. Repeat the flag to specify multiple tags.
-
--service-account=SERVICE_ACCOUNT - The IAM service account to be used for a batch/session job.
-
--staging-bucket=STAGING_BUCKET - The Cloud Storage bucket to use to store job dependencies, config files, and job driver console output. If not specified, the default [staging bucket] (https://cloud.google.com/dataproc-serverless/docs/concepts/buckets) is used.
- Network tags for traffic control.
-
--ttl=TTL - The duration after the workload will be unconditionally terminated, for example, '20m' or '1h'. Run gcloud topic datetimes for information on duration formats.
-
--user-workload-authentication-type=USER_WORKLOAD_AUTHENTICATION_TYPE - Whether to use END_USER_CREDENTIALS or SERVICE_ACCOUNT to run the workload.
-
--vars=[NAME=VALUE,…] - Mapping of query variable names to values (equivalent to the Spark SQL command: SET name="value";).
-
--version=VERSION - Optional runtime version. If not specified, a default version will be used.
- At most one of these can be specified:
-
--network=NETWORK - Network URI to connect network to.
-
--subnet=SUBNET - Subnetwork URI to connect network to. Subnet must have Private Google Access enabled.
-
-
- GCLOUD WIDE FLAGS
- These flags are available to all commands:
--access-token-file,--account,--billing-project,--configuration,--flags-file,--flatten,--format,--help,--impersonate-service-account,--log-http,--project,--quiet,--trace-token,--user-output-enabled,--verbosity.Run
$ gcloud helpfor details. - NOTES
- This variant is also available:
gcloud beta dataproc batches submit spark-sql
gcloud dataproc batches submit spark-sql
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License , and code samples are licensed under the Apache 2.0 License . For details, see the Google Developers Site Policies . Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-05-27 UTC.

