- NAME
-
- gcloud dataproc jobs submit spark-sql - submit a Spark SQL job to a cluster
- SYNOPSIS
-
-
gcloud dataproc jobs submit spark-sql(--cluster=CLUSTER|--cluster-labels=[KEY=VALUE, …]) (--execute=QUERY,-eQUERY|--file=FILE,-fFILE) [--async] [--bucket=BUCKET] [--driver-log-levels=[PACKAGE=LEVEL, …]] [--driver-required-memory-mb=DRIVER_REQUIRED_MEMORY_MB] [--driver-required-vcores=DRIVER_REQUIRED_VCORES] [--jars=[JAR, …]] [--labels=[KEY=VALUE, …]] [--max-failures-per-hour=MAX_FAILURES_PER_HOUR] [--max-failures-total=MAX_FAILURES_TOTAL] [--params=[PARAM=VALUE, …]] [--properties=[PROPERTY=VALUE, …]] [--properties-file=PROPERTIES_FILE] [--region=REGION] [GCLOUD_WIDE_FLAG …]
-
- DESCRIPTION
- Submit a Spark SQL job to a cluster.
- EXAMPLES
- To submit a Spark SQL job with a local script, run:
gcloud dataproc jobs submit spark-sql --cluster = my-cluster --file = my_queries.qlTo submit a Spark SQL job with inline queries, run:
gcloud dataproc jobs submit spark-sql --cluster = my-cluster -e = "CREATE EXTERNAL TABLE foo(bar int) LOCATION 'gs://my_bucket/'" -e = "SELECT * FROM foo WHERE bar > 2" - REQUIRED FLAGS
-
- Exactly one of these must be specified:
-
--cluster=CLUSTER - The Dataproc cluster to submit the job to.
-
--cluster-labels=[KEY=VALUE,…] - List of label KEY=VALUE pairs to add.
Keys must start with a lowercase character and contain only hyphens (
-), underscores (_), lowercase characters, and numbers. Values must contain only hyphens (-), underscores (_), lowercase characters, and numbers.Labels of Dataproc cluster on which to place the job.
-
- Exactly one of these must be specified:
-
--execute=QUERY,-eQUERY - A Spark SQL query to execute as part of the job.
-
--file=FILE,-fFILE - HCFS URI of file containing Spark SQL script to execute as the job.
-
- Exactly one of these must be specified:
- OPTIONAL FLAGS
-
-
--async - Return immediately, without waiting for the operation in progress to complete.
-
--bucket=BUCKET - The Cloud Storage bucket to stage files in. Defaults to the cluster's configured bucket.
-
--driver-log-levels=[PACKAGE=LEVEL,…] - A list of package to log4j log level pairs to configure driver logging. For example: root=FATAL,com.example=INFO
-
--driver-required-memory-mb=DRIVER_REQUIRED_MEMORY_MB - The memory allocation requested by the job driver in megabytes (MB) for execution on the driver node group (it is used only by clusters with a driver node group).
-
--driver-required-vcores=DRIVER_REQUIRED_VCORES - The vCPU allocation requested by the job driver for execution on the driver node group (it is used only by clusters with a driver node group).
-
--jars=[JAR,…] - Comma separated list of jar files to be provided to the executor and driver classpaths. May contain UDFs.
-
--labels=[KEY=VALUE,…] - List of label KEY=VALUE pairs to add.
Keys must start with a lowercase character and contain only hyphens (
-), underscores (_), lowercase characters, and numbers. Values must contain only hyphens (-), underscores (_), lowercase characters, and numbers. -
--max-failures-per-hour=MAX_FAILURES_PER_HOUR - Specifies the maximum number of times a job can be restarted per hour in event of failure. Default is 0 (no retries after job failure).
-
--max-failures-total=MAX_FAILURES_TOTAL - Specifies the maximum total number of times a job can be restarted after the job fails. Default is 0 (no retries after job failure).
-
--params=[PARAM=VALUE,…] - A list of key value pairs to set variables in the Hive queries.
-
--properties=[PROPERTY=VALUE,…] - A list of key value pairs to configure Hive.
-
--properties-file=PROPERTIES_FILE - Path to a local file or a file in a Cloud Storage bucket containing
configuration properties for the job. The client machine running this command
must have read permission to the file.
Specify properties in the form of property=value in the text file. For example:
# Properties to set for the job: key1 = value1 key2 = value2 # Comment out properties not used. # key3=value3
If a property is set in both
--propertiesand--properties-file, the value defined in--propertiestakes precedence. -
--region=REGION - Dataproc region to use. Each Dataproc region constitutes an independent resource
namespace constrained to deploying instances into Compute Engine zones inside
the region. Overrides the default
dataproc/regionproperty value for this command invocation.
-
- GCLOUD WIDE FLAGS
- These flags are available to all commands:
--access-token-file,--account,--billing-project,--configuration,--flags-file,--flatten,--format,--help,--impersonate-service-account,--log-http,--project,--quiet,--trace-token,--user-output-enabled,--verbosity.Run
$ gcloud helpfor details. - NOTES
- These variants are also available:
gcloud alpha dataproc jobs submit spark-sqlgcloud beta dataproc jobs submit spark-sql
gcloud dataproc jobs submit spark-sql
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License , and code samples are licensed under the Apache 2.0 License . For details, see the Google Developers Site Policies . Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-05-27 UTC.

