- NAME
-
- gcloud dataproc jobs submit spark-r - submit a SparkR job to a cluster
- SYNOPSIS
-
-
gcloud dataproc jobs submit spark-r R_FILE (--cluster=CLUSTER | --cluster-labels=[KEY=VALUE,…]) [--archives=[ARCHIVE,…]] [--async] [--bucket=BUCKET] [--driver-log-levels=[PACKAGE=LEVEL,…]] [--driver-required-memory-mb=DRIVER_REQUIRED_MEMORY_MB] [--driver-required-vcores=DRIVER_REQUIRED_VCORES] [--files=[FILE,…]] [--labels=[KEY=VALUE,…]] [--max-failures-per-hour=MAX_FAILURES_PER_HOUR] [--max-failures-total=MAX_FAILURES_TOTAL] [--properties=[PROPERTY=VALUE,…]] [--properties-file=PROPERTIES_FILE] [--region=REGION] [GCLOUD_WIDE_FLAG …] [-- JOB_ARGS …]
-
- DESCRIPTION
- Submit a SparkR job to a cluster.
- EXAMPLES
- To submit a SparkR job with a local script, run:
gcloud dataproc jobs submit spark-r --cluster=my-cluster my_script.R
- To submit a SparkR job that runs a script already on the cluster, run:
gcloud dataproc jobs submit spark-r --cluster=my-cluster file:///…/my_script.R -- gs://my_bucket/data.csv
- POSITIONAL ARGUMENTS
-
-
R_FILE - Main .R file to run as the driver.
-
[-- JOB_ARGS …] - Arguments to pass to the driver.
The '--' argument must be specified between gcloud-specific args on the left and JOB_ARGS on the right.
-
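The placement of the '--' separator described above can be sketched as a dry run. This is a minimal illustration that only prints the command rather than running it; `submit_cmd` is a hypothetical helper, and `my-cluster`, `my_script.R`, and the `gs://` path are the placeholder values from EXAMPLES.

```shell
#!/bin/sh
# Dry-run sketch: build and print the command instead of running it,
# so no cluster or gcloud installation is needed.
# submit_cmd is a hypothetical helper, not part of gcloud.
submit_cmd() {
    cluster=$1
    r_file=$2
    shift 2
    # Everything left in "$@" is a job argument and goes after the bare "--".
    echo "gcloud dataproc jobs submit spark-r --cluster=$cluster $r_file -- $*"
}

cmd=$(submit_cmd my-cluster my_script.R gs://my_bucket/data.csv)
echo "$cmd"
```

Everything to the left of '--' is interpreted by gcloud; everything to the right is passed to the driver as JOB_ARGS.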
- REQUIRED FLAGS
-
- Exactly one of these must be specified:
-
--cluster=CLUSTER - The Dataproc cluster to submit the job to.
-
--cluster-labels=[KEY=VALUE,…] - Labels of the Dataproc cluster on which to place the job, given as a list of KEY=VALUE pairs.
Keys must start with a lowercase character and contain only hyphens (-), underscores (_), lowercase characters, and numbers. Values must contain only hyphens (-), underscores (_), lowercase characters, and numbers.
-
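The key and value rules for labels can be checked locally before submitting. The following is a rough sketch, not gcloud's own validation: it assumes "lowercase characters" means ASCII a-z, and `valid_label_key` and `valid_label_value` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch of the label rules, assuming "lowercase characters" means ASCII a-z.
# These helpers are hypothetical and are not part of gcloud.

# Key: starts with a lowercase letter, then lowercase letters, digits, "_" or "-".
valid_label_key() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z][a-z0-9_-]*$'
}

# Value: only lowercase letters, digits, "_" or "-".
valid_label_value() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9_-]*$'
}

for key in env my-team Env 9lives; do
    if valid_label_key "$key"; then
        echo "$key: ok"
    else
        echo "$key: invalid key"
    fi
done
```

In this sketch, `Env` fails because of the uppercase letter and `9lives` fails because a key must start with a lowercase character.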
- OPTIONAL FLAGS
-
-
--archives=[ARCHIVE,…] - Comma-separated list of archives to be extracted into the working directory of each executor. Each archive must be in one of the following file formats: .zip, .tar, .tar.gz, or .tgz.
-
--async - Return immediately, without waiting for the operation in progress to complete.
-
--bucket=BUCKET - The Cloud Storage bucket to stage files in. Defaults to the cluster's configured bucket.
-
--driver-log-levels=[PACKAGE=LEVEL,…] - List of key-value pairs to configure driver logging, where the key is a package and the value is the log4j log level. For example: root=FATAL,com.example=INFO
-
--driver-required-memory-mb=DRIVER_REQUIRED_MEMORY_MB - The memory allocation requested by the job driver in megabytes (MB) for execution on the driver node group (it is used only by clusters with a driver node group).
-
--driver-required-vcores=DRIVER_REQUIRED_VCORES - The vCPU allocation requested by the job driver for execution on the driver node group (it is used only by clusters with a driver node group).
-
--files=[FILE,…] - Comma-separated list of files to be placed in the working directory of both the app driver and executors.
-
--labels=[KEY=VALUE,…] - List of label KEY=VALUE pairs to add.
Keys must start with a lowercase character and contain only hyphens (-), underscores (_), lowercase characters, and numbers. Values must contain only hyphens (-), underscores (_), lowercase characters, and numbers.
-
--max-failures-per-hour=MAX_FAILURES_PER_HOUR - Specifies the maximum number of times a job can be restarted per hour in the event of failure. Default is 0 (no retries after job failure).
-
--max-failures-total=MAX_FAILURES_TOTAL - Specifies the maximum total number of times a job can be restarted after the job fails. Default is 0 (no retries after job failure).
-
--properties=[PROPERTY=VALUE,…] - List of key-value pairs to configure SparkR. For a list of available properties, see: https://spark.apache.org/docs/latest/configuration.html#available-properties
-
--properties-file=PROPERTIES_FILE - Path to a local file or a file in a Cloud Storage bucket containing
configuration properties for the job. The client machine running this command
must have read permission to the file.
Specify properties in the form of property=value in the text file. For example:
# Properties to set for the job:
key1=value1
key2=value2
# Comment out properties not used.
# key3=value3
If a property is set in both --properties and --properties-file, the value defined in --properties takes precedence.
-
--region=REGION - Dataproc region to use. Each Dataproc region constitutes an independent resource
namespace constrained to deploying instances into Compute Engine zones inside
the region. Overrides the default dataproc/region property value for this command invocation.
-
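The --properties-file format and its interaction with --properties can be sketched as follows. This is an illustrative dry run that writes a small file and prints the command without executing it. The file name `job.properties`, the cluster and script names, and the chosen values (4g, 2g, 2) are assumptions for illustration; `spark.executor.memory` and `spark.executor.cores` are standard Spark configuration properties.

```shell
#!/bin/sh
# Sketch: write a properties file in property=value form, then print (not run)
# a submit command that references it.
# job.properties, my-cluster and my_script.R are placeholder names.
props_file=job.properties

cat > "$props_file" <<'EOF'
# Properties to set for the job:
spark.executor.memory=4g
spark.executor.cores=2
# Comment out properties not used.
# spark.executor.instances=3
EOF

# spark.executor.memory is set in both places here; per the precedence rule,
# the --properties value (2g) is the one that would apply.
cmd="gcloud dataproc jobs submit spark-r --cluster=my-cluster --properties-file=$props_file --properties=spark.executor.memory=2g my_script.R"
echo "$cmd"
```

Keeping shared defaults in the file and passing one-off overrides through --properties relies only on the documented precedence between the two flags.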
- GCLOUD WIDE FLAGS
- These flags are available to all commands:
--access-token-file, --account, --billing-project, --configuration, --flags-file, --flatten, --format, --help, --impersonate-service-account, --log-http, --project, --quiet, --trace-token, --user-output-enabled, --verbosity.
Run $ gcloud help for details.
- NOTES
- These variants are also available:
gcloud alpha dataproc jobs submit spark-r
gcloud beta dataproc jobs submit spark-r
Last updated 2026-05-27 UTC.

