gcloud dataproc jobs submit spark-sql

NAME: gcloud dataproc jobs submit spark-sql - submit a Spark SQL job to a cluster
SYNOPSIS: gcloud dataproc jobs submit spark-sql ( --cluster = CLUSTER | --cluster-labels =[ KEY = VALUE , …]) ( --execute = QUERY , -e QUERY | --file = FILE , -f FILE ) [ --async ] [ --bucket = BUCKET ] [ --driver-log-levels =[ PACKAGE = LEVEL , …]] [ --driver-required-memory-mb = DRIVER_REQUIRED_MEMORY_MB ] [ --driver-required-vcores = DRIVER_REQUIRED_VCORES ] [ --jars =[ JAR , …]] [ --labels =[ KEY = VALUE , …]] [ --max-failures-per-hour = MAX_FAILURES_PER_HOUR ] [ --max-failures-total = MAX_FAILURES_TOTAL ] [ --params =[ PARAM = VALUE , …]] [ --properties =[ PROPERTY = VALUE , …]] [ --properties-file = PROPERTIES_FILE ] [ --region = REGION ] [ GCLOUD_WIDE_FLAG … ]
DESCRIPTION: Submit a Spark SQL job to a cluster.
EXAMPLES: To submit a Spark SQL job with a local script, run:
gcloud dataproc jobs submit spark-sql --cluster = my-cluster --file = my_queries.ql

To submit a Spark SQL job with inline queries, run:

gcloud dataproc jobs submit spark-sql --cluster = my-cluster -e = "CREATE EXTERNAL TABLE foo(bar int) LOCATION 'gs://my_bucket/'" -e = "SELECT * FROM foo WHERE bar > 2"
REQUIRED FLAGS: Exactly one of these must be specified:

--cluster = CLUSTER

The Dataproc cluster to submit the job to.

--cluster-labels =[ KEY = VALUE ,…]

List of label KEY=VALUE pairs to add.
Keys must start with a lowercase character and contain only hyphens ( - ), underscores ( _ ), lowercase characters, and numbers. Values must contain only hyphens ( - ), underscores ( _ ), lowercase characters, and numbers.

Labels of Dataproc cluster on which to place the job.

Exactly one of these must be specified:

--execute = QUERY , -e QUERY

A Spark SQL query to execute as part of the job.

--file = FILE , -f FILE

HCFS URI of file containing Spark SQL script to execute as the job.
OPTIONAL FLAGS: --async

Return immediately, without waiting for the operation in progress to complete.

--bucket = BUCKET

The Cloud Storage bucket to stage files in. Defaults to the cluster's configured bucket.

--driver-log-levels =[ PACKAGE = LEVEL ,…]

A list of package to log4j log level pairs to configure driver logging. For example: root=FATAL,com.example=INFO

--driver-required-memory-mb = DRIVER_REQUIRED_MEMORY_MB

The memory allocation requested by the job driver in megabytes (MB) for execution on the driver node group (it is used only by clusters with a driver node group).

--driver-required-vcores = DRIVER_REQUIRED_VCORES

The vCPU allocation requested by the job driver for execution on the driver node group (it is used only by clusters with a driver node group).

--jars =[ JAR ,…]

Comma separated list of jar files to be provided to the executor and driver classpaths. May contain UDFs.

--labels =[ KEY = VALUE ,…]

List of label KEY=VALUE pairs to add.
Keys must start with a lowercase character and contain only hyphens ( - ), underscores ( _ ), lowercase characters, and numbers. Values must contain only hyphens ( - ), underscores ( _ ), lowercase characters, and numbers.

--max-failures-per-hour = MAX_FAILURES_PER_HOUR

Specifies the maximum number of times a job can be restarted per hour in event of failure. Default is 0 (no retries after job failure).

--max-failures-total = MAX_FAILURES_TOTAL

Specifies the maximum total number of times a job can be restarted after the job fails. Default is 0 (no retries after job failure).

--params =[ PARAM = VALUE ,…]

A list of key value pairs to set variables in the Hive queries.

--properties =[ PROPERTY = VALUE ,…]

A list of key value pairs to configure Hive.

--properties-file = PROPERTIES_FILE

Path to a local file or a file in a Cloud Storage bucket containing configuration properties for the job. The client machine running this command must have read permission to the file.
Specify properties in the form of property=value in the text file. For example:

# Properties to set for the job: key1 = value1 key2 = value2 # Comment out properties not used. # key3=value3

If a property is set in both --properties and --properties-file , the value defined in --properties takes precedence.

--region = REGION

Dataproc region to use. Each Dataproc region constitutes an independent resource namespace constrained to deploying instances into Compute Engine zones inside the region. Overrides the default dataproc/region property value for this command invocation.
GCLOUD WIDE FLAGS: These flags are available to all commands: --access-token-file , --account , --billing-project , --configuration , --flags-file , --flatten , --format , --help , --impersonate-service-account , --log-http , --project , --quiet , --trace-token , --user-output-enabled , --verbosity .
Run $ gcloud help for details.
NOTES: These variants are also available:
gcloud alpha dataproc jobs submit spark-sql

gcloud beta dataproc jobs submit spark-sql

gcloud dataproc jobs submit spark-sql Stay organized with collections Save and categorize content based on your preferences.

gcloud dataproc jobs submit spark-sql