Reference documentation and code samples for the Google Cloud Dataproc V1 Client class PySparkJob.
A Dataproc job for running [Apache PySpark](https://spark.apache.org/docs/0.9.0/python-programming-guide.html) applications on YARN.
Generated from protobuf message `google.cloud.dataproc.v1.PySparkJob`
Namespace
Google \ Cloud \ Dataproc \ V1
Methods
__construct
Constructor.
Parameters
Name
Description
data
array
Optional. Data for populating the Message object.
↳ main_python_file_uri
string
Required. The HCFS URI of the main Python file to use as the driver. Must be a .py file.
↳ args
array
Optional. The arguments to pass to the driver. Do not include arguments, such as `--conf`, that can be set as job properties, since a collision may occur that causes an incorrect job submission.
↳ python_file_uris
array
Optional. HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.
↳ jar_file_uris
array
Optional. HCFS URIs of jar files to add to the CLASSPATHs of the Python driver and tasks.
↳ file_uris
array
Optional. HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.
↳ archive_uris
array
Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
↳ properties
array|Google\Protobuf\Internal\MapField
Optional. A mapping of property names to values, used to configure PySpark. Properties that conflict with values set by the Dataproc API may be overwritten. Can include properties set in /etc/spark/conf/spark-defaults.conf and classes in user code.
↳ logging_config
Google\Cloud\Dataproc\V1\LoggingConfig
Optional. The runtime log config for job execution.
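For illustration, a minimal sketch of populating the message through the constructor's data array; the keys match the proto field names listed above, and the gs:// bucket and file paths are placeholders, not values from this reference:

```php
<?php

use Google\Cloud\Dataproc\V1\PySparkJob;

// Keys in the data array correspond to the proto fields documented above.
// All gs:// URIs here are hypothetical placeholders.
$pySparkJob = new PySparkJob([
    'main_python_file_uri' => 'gs://my-bucket/jobs/wordcount.py',
    'args' => ['gs://my-bucket/input/', 'gs://my-bucket/output/'],
    'python_file_uris' => ['gs://my-bucket/libs/helpers.zip'],
    'jar_file_uris' => ['gs://my-bucket/libs/spark-bigquery.jar'],
]);
```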
getMainPythonFileUri
Required. The HCFS URI of the main Python file to use as the driver. Must
be a .py file.
Returns
Type
Description
string
setMainPythonFileUri
Required. The HCFS URI of the main Python file to use as the driver. Must
be a .py file.
Parameter
Name
Description
var
string
Returns
Type
Description
$this
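As a minimal sketch, the getter/setter pair for the required driver file; the URI is a placeholder:

```php
<?php

use Google\Cloud\Dataproc\V1\PySparkJob;

$job = new PySparkJob();
// The driver must be a .py file reachable via an HCFS URI (e.g. gs://).
$job->setMainPythonFileUri('gs://my-bucket/jobs/wordcount.py');

echo $job->getMainPythonFileUri(), PHP_EOL; // gs://my-bucket/jobs/wordcount.py
```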
getArgs
Optional. The arguments to pass to the driver. Do not include arguments,
such as `--conf`, that can be set as job properties, since a collision may
occur that causes an incorrect job submission.
Returns
Type
Description
Google\Protobuf\Internal\RepeatedField
setArgs
Optional. The arguments to pass to the driver. Do not include arguments,
such as `--conf`, that can be set as job properties, since a collision may
occur that causes an incorrect job submission.
Parameter
Name
Description
var
string[]
Returns
Type
Description
$this
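As a sketch, setting and reading the repeated args field; getters on repeated fields return a RepeatedField that can be iterated like an array, and the argument values are placeholders:

```php
<?php

use Google\Cloud\Dataproc\V1\PySparkJob;

$job = new PySparkJob();
// Pass plain driver arguments only; flags like --conf belong in properties.
$job->setArgs(['--input=gs://my-bucket/data', '--verbose']);

foreach ($job->getArgs() as $arg) {
    echo $arg, PHP_EOL;
}
```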
getPythonFileUris
Optional. HCFS file URIs of Python files to pass to the PySpark
framework. Supported file types: .py, .egg, and .zip.
Returns
Type
Description
Google\Protobuf\Internal\RepeatedField
setPythonFileUris
Optional. HCFS file URIs of Python files to pass to the PySpark
framework. Supported file types: .py, .egg, and .zip.
Parameter
Name
Description
var
string[]
Returns
Type
Description
$this
getJarFileUris
Optional. HCFS URIs of jar files to add to the CLASSPATHs of the
Python driver and tasks.
Returns
Type
Description
Google\Protobuf\Internal\RepeatedField
setJarFileUris
Optional. HCFS URIs of jar files to add to the CLASSPATHs of the
Python driver and tasks.
Parameter
Name
Description
var
string[]
Returns
Type
Description
$this
getFileUris
Optional. HCFS URIs of files to be placed in the working directory of
each executor. Useful for naively parallel tasks.
Returns
Type
Description
Google\Protobuf\Internal\RepeatedField
setFileUris
Optional. HCFS URIs of files to be placed in the working directory of
each executor. Useful for naively parallel tasks.
Parameter
Name
Description
var
string[]
Returns
Type
Description
$this
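A brief sketch of shipping side files and jars to each executor, relying on the `$this` return values for chaining; the URIs are placeholders:

```php
<?php

use Google\Cloud\Dataproc\V1\PySparkJob;

$job = new PySparkJob();
// Each file URI lands in the working directory of every executor;
// jar URIs are added to the CLASSPATHs of the driver and tasks.
$job->setFileUris(['gs://my-bucket/config/lookup.csv'])
    ->setJarFileUris(['gs://my-bucket/libs/gcs-connector.jar']);
```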
getArchiveUris
Optional. HCFS URIs of archives to be extracted into the working directory
of each executor. Supported file types:
.jar, .tar, .tar.gz, .tgz, and .zip.
Returns
Type
Description
Google\Protobuf\Internal\RepeatedField
setArchiveUris
Optional. HCFS URIs of archives to be extracted into the working directory
of each executor. Supported file types:
.jar, .tar, .tar.gz, .tgz, and .zip.
Parameter
Name
Description
var
string[]
Returns
Type
Description
$this
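For example, a zipped bundle of Python dependencies can be shipped as an archive and extracted on each executor; the path is a placeholder:

```php
<?php

use Google\Cloud\Dataproc\V1\PySparkJob;

$job = new PySparkJob();
// The archive is unpacked into each executor's working directory.
$job->setArchiveUris(['gs://my-bucket/envs/deps.zip']);
```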
getProperties
Optional. A mapping of property names to values, used to configure PySpark.
Properties that conflict with values set by the Dataproc API may be
overwritten. Can include properties set in
/etc/spark/conf/spark-defaults.conf and classes in user code.
Returns
Type
Description
Google\Protobuf\Internal\MapField
setProperties
Optional. A mapping of property names to values, used to configure PySpark.
Properties that conflict with values set by the Dataproc API may be
overwritten. Can include properties set in
/etc/spark/conf/spark-defaults.conf and classes in user code.
Parameter
Name
Description
var
array|Google\Protobuf\Internal\MapField
Returns
Type
Description
$this
getLoggingConfig
Optional. The runtime log config for job execution.
Returns
Type
Description
Google\Cloud\Dataproc\V1\LoggingConfig|null
hasLoggingConfig
clearLoggingConfig
setLoggingConfig
Optional. The runtime log config for job execution.
Parameter
Name
Description
var
Google\Cloud\Dataproc\V1\LoggingConfig
Returns
Type
Description
$this
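A sketch combining the map-valued properties field with a logging config; the property names and values are placeholders, and an empty LoggingConfig stands in for a real configuration:

```php
<?php

use Google\Cloud\Dataproc\V1\LoggingConfig;
use Google\Cloud\Dataproc\V1\PySparkJob;

$job = new PySparkJob();

// Spark properties that would otherwise be passed via --conf.
$job->setProperties([
    'spark.executor.memory' => '4g',
    'spark.dynamicAllocation.enabled' => 'true',
]);

// Attach a runtime log config; an empty message is used here as a placeholder.
$job->setLoggingConfig(new LoggingConfig());

// hasLoggingConfig() now reports true; clearLoggingConfig() would unset it.
var_dump($job->hasLoggingConfig());
```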
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-09-04 UTC."],[],[],null,["# Google Cloud Dataproc V1 Client - Class PySparkJob (3.14.0)\n\nVersion latestkeyboard_arrow_down\n\n- [3.14.0 (latest)](/php/docs/reference/cloud-dataproc/latest/V1.PySparkJob)\n- [3.13.4](/php/docs/reference/cloud-dataproc/3.13.4/V1.PySparkJob)\n- [3.12.0](/php/docs/reference/cloud-dataproc/3.12.0/V1.PySparkJob)\n- [3.11.0](/php/docs/reference/cloud-dataproc/3.11.0/V1.PySparkJob)\n- [3.10.1](/php/docs/reference/cloud-dataproc/3.10.1/V1.PySparkJob)\n- [3.9.0](/php/docs/reference/cloud-dataproc/3.9.0/V1.PySparkJob)\n- [3.8.1](/php/docs/reference/cloud-dataproc/3.8.1/V1.PySparkJob)\n- [3.7.1](/php/docs/reference/cloud-dataproc/3.7.1/V1.PySparkJob)\n- [3.6.1](/php/docs/reference/cloud-dataproc/3.6.1/V1.PySparkJob)\n- [3.5.1](/php/docs/reference/cloud-dataproc/3.5.1/V1.PySparkJob)\n- [3.4.0](/php/docs/reference/cloud-dataproc/3.4.0/V1.PySparkJob)\n- [3.3.0](/php/docs/reference/cloud-dataproc/3.3.0/V1.PySparkJob)\n- [3.2.2](/php/docs/reference/cloud-dataproc/3.2.2/V1.PySparkJob)\n- [2.6.1](/php/docs/reference/cloud-dataproc/2.6.1/V1.PySparkJob)\n- [2.5.0](/php/docs/reference/cloud-dataproc/2.5.0/V1.PySparkJob)\n- [2.3.0](/php/docs/reference/cloud-dataproc/2.3.0/V1.PySparkJob)\n- [2.2.3](/php/docs/reference/cloud-dataproc/2.2.3/V1.PySparkJob)\n- [2.1.0](/php/docs/reference/cloud-dataproc/2.1.0/V1.PySparkJob)\n- [2.0.0](/php/docs/reference/cloud-dataproc/2.0.0/V1.PySparkJob) \nReference documentation and code samples for the Google Cloud Dataproc V1 Client class PySparkJob.\n\nA Dataproc job for running\n[Apache\nPySpark](https://spark.apache.org/docs/0.9.0/python-programming-guide.html)\napplications on YARN.\n\nGenerated from protobuf message `google.cloud.dataproc.v1.PySparkJob`\n\nNamespace\n---------\n\nGoogle \\\\ Cloud \\\\ Dataproc \\\\ V1\n\nMethods\n-------\n\n### __construct\n\nConstructor.\n\n### getMainPythonFileUri\n\nRequired. The HCFS URI of the main Python file to use as the driver. Must\nbe a .py file.\n\n### setMainPythonFileUri\n\nRequired. The HCFS URI of the main Python file to use as the driver. Must\nbe a .py file.\n\n### getArgs\n\nOptional. The arguments to pass to the driver. Do not include arguments,\nsuch as `--conf`, that can be set as job properties, since a collision may\noccur that causes an incorrect job submission.\n\n### setArgs\n\nOptional. The arguments to pass to the driver. Do not include arguments,\nsuch as `--conf`, that can be set as job properties, since a collision may\noccur that causes an incorrect job submission.\n\n### getPythonFileUris\n\nOptional. HCFS file URIs of Python files to pass to the PySpark\nframework. Supported file types: .py, .egg, and .zip.\n\n### setPythonFileUris\n\nOptional. HCFS file URIs of Python files to pass to the PySpark\nframework. Supported file types: .py, .egg, and .zip.\n\n### getJarFileUris\n\nOptional. HCFS URIs of jar files to add to the CLASSPATHs of the\nPython driver and tasks.\n\n### setJarFileUris\n\nOptional. HCFS URIs of jar files to add to the CLASSPATHs of the\nPython driver and tasks.\n\n### getFileUris\n\nOptional. 
HCFS URIs of files to be placed in the working directory of\neach executor. Useful for naively parallel tasks.\n\n### setFileUris\n\nOptional. HCFS URIs of files to be placed in the working directory of\neach executor. Useful for naively parallel tasks.\n\n### getArchiveUris\n\nOptional. HCFS URIs of archives to be extracted into the working directory\nof each executor. Supported file types:\n.jar, .tar, .tar.gz, .tgz, and .zip.\n\n### setArchiveUris\n\nOptional. HCFS URIs of archives to be extracted into the working directory\nof each executor. Supported file types:\n.jar, .tar, .tar.gz, .tgz, and .zip.\n\n### getProperties\n\nOptional. A mapping of property names to values, used to configure PySpark.\n\nProperties that conflict with values set by the Dataproc API may be\noverwritten. Can include properties set in\n/etc/spark/conf/spark-defaults.conf and classes in user code.\n\n### setProperties\n\nOptional. A mapping of property names to values, used to configure PySpark.\n\nProperties that conflict with values set by the Dataproc API may be\noverwritten. Can include properties set in\n/etc/spark/conf/spark-defaults.conf and classes in user code.\n\n### getLoggingConfig\n\nOptional. The runtime log config for job execution.\n\n### hasLoggingConfig\n\n### clearLoggingConfig\n\n### setLoggingConfig\n\nOptional. The runtime log config for job execution."]]