Reference documentation and code samples for the Google Cloud Dataproc V1 Client class PySparkJob.
A Dataproc job for running Apache PySpark applications on YARN.
Generated from protobuf message google.cloud.dataproc.v1.PySparkJob
Methods
__construct
Constructor.
data
array
Optional. Data for populating the Message object.
↳ main_python_file_uri
string
Required. The HCFS URI of the main Python file to use as the driver. Must be a .py file.
↳ args
array
Optional. The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.
↳ python_file_uris
array
Optional. HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.
↳ jar_file_uris
array
Optional. HCFS URIs of jar files to add to the CLASSPATHs of the Python driver and tasks.
↳ file_uris
array
Optional. HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.
↳ archive_uris
array
Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
↳ properties
array|Google\Protobuf\Internal\MapField
Optional. A mapping of property names to values, used to configure PySpark. Properties that conflict with values set by the Dataproc API may be overwritten. Can include properties set in /etc/spark/conf/spark-defaults.conf and classes in user code.
↳ logging_config
Google\Cloud\Dataproc\V1\LoggingConfig
Optional. The runtime log config for job execution.
getMainPythonFileUri
Required. The HCFS URI of the main Python file to use as the driver. Must be a .py file.
string
setMainPythonFileUri
Required. The HCFS URI of the main Python file to use as the driver. Must be a .py file.
var
string
$this
getArgs
Optional. The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.
Google\Protobuf\Internal\RepeatedField
setArgs
Optional. The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.
var
string[]
$this
getPythonFileUris
Optional. HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.
Google\Protobuf\Internal\RepeatedField
setPythonFileUris
Optional. HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.
var
string[]
$this
getJarFileUris
Optional. HCFS URIs of jar files to add to the CLASSPATHs of the Python driver and tasks.
Google\Protobuf\Internal\RepeatedField
setJarFileUris
Optional. HCFS URIs of jar files to add to the CLASSPATHs of the Python driver and tasks.
var
string[]
$this
getFileUris
Optional. HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.
Google\Protobuf\Internal\RepeatedField
setFileUris
Optional. HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.
var
string[]
$this
getArchiveUris
Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
Google\Protobuf\Internal\RepeatedField
setArchiveUris
Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
var
string[]
$this
getProperties
Optional. A mapping of property names to values, used to configure PySpark.
Properties that conflict with values set by the Dataproc API may be overwritten. Can include properties set in /etc/spark/conf/spark-defaults.conf and classes in user code.
Google\Protobuf\Internal\MapField
setProperties
Optional. A mapping of property names to values, used to configure PySpark.
Properties that conflict with values set by the Dataproc API may be overwritten. Can include properties set in /etc/spark/conf/spark-defaults.conf and classes in user code.
var
array|Google\Protobuf\Internal\MapField
$this
getLoggingConfig
Optional. The runtime log config for job execution.
Google\Cloud\Dataproc\V1\LoggingConfig|null
hasLoggingConfig
bool
clearLoggingConfig
setLoggingConfig
Optional. The runtime log config for job execution.
var
Google\Cloud\Dataproc\V1\LoggingConfig
$this