The config settings for Compute Engine resources in an instance group, such as a master or worker group.
JSON representation |
---|
{ "numInstances" : integer , "instanceNames" : [ string ] , "imageUri" : string , "machineTypeUri" : string , "diskConfig" : { object ( |
numInstances
integer
Optional. The number of VM instances in the instance group. For HA cluster masterConfig groups, must be set to 3. For standard cluster masterConfig groups, must be set to 1.
instanceNames[]
string
Output only. The list of instance names. Dataproc derives the names from clusterName, numInstances, and the instance group.
imageUri
string
Optional. The Compute Engine image resource used for cluster instances.
The URI can represent an image or image family.
Image examples:
- https://www.googleapis.com/compute/v1/projects/[projectId]/global/images/[image-id]
- projects/[projectId]/global/images/[image-id]
- image-id
Image family examples. Dataproc will use the most recent image from the family:
- https://www.googleapis.com/compute/v1/projects/[projectId]/global/images/family/[custom-image-family-name]
- projects/[projectId]/global/images/family/[custom-image-family-name]
If the URI is unspecified, it will be inferred from SoftwareConfig.image_version or the system default.
machineTypeUri
string
Optional. The Compute Engine machine type used for cluster instances.
A full URL, partial URI, or short name is valid. Examples:
- https://www.googleapis.com/compute/v1/projects/[projectId]/zones/[zone]/machineTypes/n1-standard-2
- projects/[projectId]/zones/[zone]/machineTypes/n1-standard-2
- n1-standard-2
Auto Zone Exception: If you are using the Dataproc Auto Zone Placement feature, you must use the short name of the machine type resource, for example, n1-standard-2.
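The three accepted forms differ only in how much of the resource path they carry, so a short name can be recovered from either longer form by taking the last path segment. The following is a minimal sketch of that idea; the helper name `machine_type_short_name`, the project `my-project`, and the zone `us-central1-a` are hypothetical placeholders and not part of the Dataproc API.

```python
def machine_type_short_name(machine_type_uri: str) -> str:
    """Return the last path segment of a machine type reference.

    Hypothetical helper: works for a full URL, a partial URI, or a
    value that is already a short name.
    """
    # rstrip tolerates a trailing slash; rsplit keeps only the final segment.
    return machine_type_uri.rstrip("/").rsplit("/", 1)[-1]


# Placeholder project and zone, used only for illustration.
full_url = (
    "https://www.googleapis.com/compute/v1/projects/my-project"
    "/zones/us-central1-a/machineTypes/n1-standard-2"
)
partial_uri = "projects/my-project/zones/us-central1-a/machineTypes/n1-standard-2"
short_name = "n1-standard-2"

for value in (full_url, partial_uri, short_name):
    print(machine_type_short_name(value))
```

A helper like this may be useful when a config written with full URLs has to be reused with Auto Zone Placement, which requires the short name.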
diskConfig
object (DiskConfig)
Optional. Disk option config settings.
isPreemptible
boolean
Output only. Specifies that this instance group contains preemptible instances.
preemptibility
enum (Preemptibility)
Optional. Specifies the preemptibility of the instance group.
The default value for master and worker groups is NON_PREEMPTIBLE. This default cannot be changed.
The default value for secondary instances is PREEMPTIBLE.
managedGroupConfig
object (ManagedGroupConfig)
Output only. The config for Compute Engine Instance Group Manager that manages this group. This is only used for preemptible instance groups.
accelerators[]
object (AcceleratorConfig)
Optional. The Compute Engine accelerator configuration for these instances.
minCpuPlatform
string
Optional. Specifies the minimum CPU platform for the instance group. See Dataproc -> Minimum CPU Platform.
minNumInstances
integer
Optional. The minimum number of primary worker instances to create. If minNumInstances is set, cluster creation will succeed if the number of primary workers created is at least minNumInstances.
Example: Cluster creation request with numInstances = 5 and minNumInstances = 3:
- If 4 VMs are created and 1 instance fails, the failed VM is deleted. The cluster is resized to 4 instances and placed in a RUNNING state.
- If 2 instances are created and 3 instances fail, the cluster is placed in an ERROR state. The failed VMs are not deleted.
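The rule in the example above reduces to a single comparison between the number of primary workers that were created and minNumInstances. The sketch below models only that documented rule; the function name `creation_outcome` is hypothetical, and it is not how Dataproc itself is implemented.

```python
def creation_outcome(num_instances: int, min_num_instances: int, created: int) -> str:
    """Model the documented cluster state after primary worker creation.

    Hypothetical helper: `created` is the number of primary workers that
    came up successfully out of `num_instances` requested.
    """
    if not 0 <= created <= num_instances:
        raise ValueError("created must be between 0 and num_instances")
    if created >= min_num_instances:
        # Enough workers exist: failed VMs are deleted and the cluster is
        # resized to the number of workers that were created.
        return "RUNNING"
    # Too few workers exist: the failed VMs are not deleted.
    return "ERROR"


# The two cases from the example: numInstances = 5, minNumInstances = 3.
print(creation_outcome(5, 3, created=4))
print(creation_outcome(5, 3, created=2))
```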
instanceFlexibilityPolicy
object (InstanceFlexibilityPolicy)
Optional. Instance flexibility policy allowing a mixture of VM shapes and provisioning models.
startupConfig
object (StartupConfig)
Optional. Configuration to handle the startup of instances during cluster creation and update.
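To show how the fields above fit together, here is a sketch of a worker group config built as a Python dictionary and serialized to JSON. All values are illustrative assumptions (machine type, disk sizes, accelerator choice), and the variable name `worker_config` is hypothetical. Whether a given combination is accepted depends on the project, zone, and machine type.

```python
import json

# Illustrative request fragment for a primary worker group.
worker_config = {
    "numInstances": 5,
    "minNumInstances": 3,
    "machineTypeUri": "n1-standard-2",  # short name form
    "diskConfig": {
        "bootDiskType": "pd-standard",
        "bootDiskSizeGb": 500,
        "numLocalSsds": 0,
    },
    # Master and worker groups only allow NON_PREEMPTIBLE.
    "preemptibility": "NON_PREEMPTIBLE",
    "accelerators": [
        {"acceleratorTypeUri": "nvidia-tesla-t4", "acceleratorCount": 1},
    ],
}

# Output-only fields (instanceNames, isPreemptible, managedGroupConfig) are
# populated by the service, so they are left out of the request body.
print(json.dumps(worker_config, indent=2))
```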
DiskConfig
Specifies the config of disk options for a group of VM instances.
JSON representation |
---|
{ "bootDiskType" : string , "bootDiskSizeGb" : integer , "numLocalSsds" : integer , "localSsdInterface" : string } |
Fields | |
---|---|
bootDiskType | Optional. Type of the boot disk (default is "pd-standard"). Valid values: "pd-balanced" (Persistent Disk Balanced Solid State Drive), "pd-ssd" (Persistent Disk Solid State Drive), or "pd-standard" (Persistent Disk Hard Disk Drive). See Disk types. |
bootDiskSizeGb | Optional. Size in GB of the boot disk (default is 500 GB). |
numLocalSsds | Optional. Number of attached SSDs, from 0 to 8 (default is 0). If SSDs are not attached, the boot disk is used to store runtime logs and HDFS data. If one or more SSDs are attached, this runtime bulk data is spread across them, and the boot disk contains only basic config and installed binaries. Note: Local SSD options may vary by machine type and number of vCPUs selected. |
localSsdInterface | Optional. Interface type of local SSDs (default is "scsi"). Valid values: "scsi" (Small Computer System Interface), "nvme" (Non-Volatile Memory Express). See local SSD performance. |
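The constraints in the table above can be checked on the client side before a request is sent. The sketch below checks only the values listed in this section; the function name `disk_config_problems` is hypothetical, and the service may accept or reject values differently, for example depending on the machine type.

```python
# Values taken from the field descriptions in this section.
VALID_BOOT_DISK_TYPES = {"pd-balanced", "pd-ssd", "pd-standard"}
VALID_LOCAL_SSD_INTERFACES = {"scsi", "nvme"}


def disk_config_problems(disk_config: dict) -> list:
    """Return a list of problem descriptions; an empty list means none found."""
    problems = []

    # Missing fields fall back to the documented defaults.
    boot_disk_type = disk_config.get("bootDiskType", "pd-standard")
    if boot_disk_type not in VALID_BOOT_DISK_TYPES:
        problems.append(f"unknown bootDiskType: {boot_disk_type!r}")

    num_local_ssds = disk_config.get("numLocalSsds", 0)
    if not 0 <= num_local_ssds <= 8:
        problems.append(f"numLocalSsds outside 0 to 8: {num_local_ssds}")

    interface = disk_config.get("localSsdInterface", "scsi")
    if interface not in VALID_LOCAL_SSD_INTERFACES:
        problems.append(f"unknown localSsdInterface: {interface!r}")

    return problems


print(disk_config_problems({"bootDiskType": "pd-ssd", "numLocalSsds": 2}))
print(disk_config_problems({"numLocalSsds": 9}))
```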
Preemptibility
Controls the use of preemptible instances within the group.
Enums | |
---|---|
PREEMPTIBILITY_UNSPECIFIED | Preemptibility is unspecified; the system will choose the appropriate setting for each instance group. |
NON_PREEMPTIBLE | Instances are non-preemptible. This option is allowed for all instance groups and is the only valid value for Master and Worker instance groups. |
PREEMPTIBLE | Instances are preemptible. This option is allowed only for secondary worker groups. |
SPOT | Instances are Spot VMs. This option is allowed only for secondary worker groups. Spot VMs are the latest version of preemptible VMs, and provide additional features. |
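The enum descriptions above imply a small compatibility rule between the kind of instance group and the preemptibility value. The sketch below encodes that rule; the function name and the role strings `"master"`, `"worker"`, and `"secondary_worker"` are hypothetical labels chosen for illustration, not API values.

```python
# Enum values that the table allows only for secondary worker groups.
SECONDARY_ONLY = {"PREEMPTIBLE", "SPOT"}
# Enum values that the table allows for every instance group.
ALLOWED_EVERYWHERE = {"PREEMPTIBILITY_UNSPECIFIED", "NON_PREEMPTIBLE"}


def preemptibility_allowed(group_role: str, preemptibility: str) -> bool:
    """Return True if the value is allowed for the given (hypothetical) role."""
    if preemptibility in ALLOWED_EVERYWHERE:
        # Unspecified lets the system choose; NON_PREEMPTIBLE is always valid.
        return True
    if preemptibility in SECONDARY_ONLY:
        return group_role == "secondary_worker"
    raise ValueError(f"unknown preemptibility value: {preemptibility!r}")


print(preemptibility_allowed("worker", "SPOT"))
print(preemptibility_allowed("secondary_worker", "SPOT"))
```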
ManagedGroupConfig
Specifies the resources used to actively manage an instance group.
JSON representation |
---|
{ "instanceTemplateName" : string , "instanceGroupManagerName" : string , "instanceGroupManagerUri" : string } |
Fields | |
---|---|
instanceTemplateName | Output only. The name of the Instance Template used for the Managed Instance Group. |
instanceGroupManagerName | Output only. The name of the Instance Group Manager for this group. |
instanceGroupManagerUri | Output only. The partial URI to the instance group manager for this group. E.g. projects/my-project/regions/us-central1/instanceGroupManagers/my-igm. |
AcceleratorConfig
Specifies the type and number of accelerator cards attached to the instances of an instance group. See GPUs on Compute Engine.
JSON representation |
---|
{ "acceleratorTypeUri" : string , "acceleratorCount" : integer } |
acceleratorTypeUri
string
Full URL, partial URI, or short name of the accelerator type resource to expose to this instance. See Compute Engine AcceleratorTypes .
Examples:
- https://www.googleapis.com/compute/v1/projects/[projectId]/zones/[zone]/acceleratorTypes/nvidia-tesla-t4
- projects/[projectId]/zones/[zone]/acceleratorTypes/nvidia-tesla-t4
- nvidia-tesla-t4
Auto Zone Exception: If you are using the Dataproc Auto Zone Placement feature, you must use the short name of the accelerator type resource, for example, nvidia-tesla-t4.
acceleratorCount
integer
The number of accelerator cards of this type exposed to this instance.
InstanceFlexibilityPolicy
Instance flexibility policy allowing a mixture of VM shapes and provisioning models.
JSON representation |
---|
{ "instanceSelectionList" : [ { object ( |
Fields | |
---|---|
instanceSelectionList[] | Optional. List of instance selection options that the group will use when creating new VMs. |
instanceSelectionResults[] | Output only. A list of instance selection results in the group. |
InstanceSelection
Defines machine types and the rank to which those machine types belong.
JSON representation |
---|
{ "machineTypes" : [ string ] , "rank" : integer } |
Fields | |
---|---|
machineTypes[] | Optional. Full machine-type names, e.g. "n1-standard-16". |
rank | Optional. Preference of this instance selection. A lower number means higher preference. Dataproc will first try to create a VM using the machine types with the highest-preference rank, and fall back to the next rank based on availability. Machine types and instance selections with the same rank have the same preference. |
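The rank semantics can be illustrated with a small selection routine that tries lower ranks first and falls back when no machine type in a rank is available. This is only a sketch of the described ordering, not Dataproc's actual placement logic. The function name `pick_machine_type`, the `available` set, the default rank of 0 for a missing field, and the tie-break within a rank (first listed wins) are all assumptions.

```python
def pick_machine_type(instance_selection_list, available):
    """Return the first available machine type in rank order, or None.

    Hypothetical helper: `available` is a set of machine-type names that
    can currently be provisioned.
    """
    # Lower rank means higher preference. A missing rank is treated as 0
    # (assumption). sorted() is stable, so equal ranks keep list order.
    by_rank = sorted(instance_selection_list, key=lambda s: s.get("rank", 0))
    for selection in by_rank:
        for machine_type in selection.get("machineTypes", []):
            if machine_type in available:
                return machine_type
    return None


selections = [
    {"machineTypes": ["n2-standard-16"], "rank": 1},
    {"machineTypes": ["n1-standard-16"], "rank": 0},
]

# Rank 0 is preferred when it is available.
print(pick_machine_type(selections, {"n1-standard-16", "n2-standard-16"}))
# Falls back to rank 1 when rank 0 is not available.
print(pick_machine_type(selections, {"n2-standard-16"}))
```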
InstanceSelectionResult
Defines a mapping from machine types to the number of VMs that are created with each machine type.
JSON representation |
---|
{ "machineType" : string , "vmCount" : integer } |
Fields | |
---|---|
machineType | Output only. Full machine-type name, e.g. "n1-standard-16". |
vmCount | Output only. Number of VMs provisioned with the machineType. |
StartupConfig
Configuration to handle the startup of instances during cluster creation and update.
JSON representation |
---|
{ "requiredRegistrationFraction" : number } |
Fields | |
---|---|
requiredRegistrationFraction | Optional. The fraction of instances that must be up and running before cluster creation or update is considered successful. This setting currently applies only to secondary workers. The cluster will fail if requiredRegistrationFraction of instances are not available. Availability here includes instance creation, agent registration, and service registration (if enabled). |
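As a rough illustration of the threshold described above, the check can be written as a comparison between the number of registered instances and the required share of the total. The function name `registration_satisfied` is hypothetical. The valid range of 0.0 to 1.0 and the handling of boundary cases are assumptions, since this section does not specify them.

```python
def registration_satisfied(registered: int, total: int, required_fraction: float) -> bool:
    """Return True if enough instances have registered.

    Hypothetical helper. Assumes required_fraction lies in [0.0, 1.0] and
    that meeting the fraction exactly counts as success.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0.0 <= required_fraction <= 1.0:
        raise ValueError("required_fraction must be between 0.0 and 1.0")
    return registered >= required_fraction * total


# With 10 secondary workers and a required fraction of 0.5:
print(registration_satisfied(5, 10, 0.5))
print(registration_satisfied(4, 10, 0.5))
```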