Dataproc Auto Zone placement

When you create a Dataproc cluster, cluster resources use a regional endpoint based on Compute Engine zones. When you choose a region, you can select a zone within that region, or you can omit the zone to have the Dataproc Auto Zone feature select a zone for you in that region. Once a zone is selected, all nodes for that cluster are deployed to that zone.

You can exclude zones from Auto Zone selection, provided that the cluster region includes at least two non-excluded zones. For more information, see Use Auto Zone placement.

Auto Zone and resource reservations

Auto Zone prioritizes creating a cluster in a zone with resource reservations, as follows:

  • If the requested cluster resources can be fully satisfied in a zone by reserved resources plus, if necessary, on-demand resources, Auto Zone consumes the reserved and on-demand resources and creates the cluster in that zone.

  • Auto Zone prioritizes zones for selection according to the total CPU core (vCPU) reservations in each zone.

    Example: A cluster creation request specifies 20 n2-standard-2 and 1 n2-standard-64 machines (40 + 64 = 104 vCPUs requested). Auto Zone will prioritize the following zones for selection according to the total vCPU reservations available in each zone:

    1. zone-c available reservations: 3 n2-standard-2 and 1 n2-standard-64 (70 vCPUs)
    2. zone-b available reservations: 1 n2-standard-64 (64 vCPUs)
    3. zone-a available reservations: 25 n2-standard-2 (50 vCPUs)

      Assuming each of these zones has additional on-demand vCPU and other resources sufficient to satisfy the cluster request, Auto Zone will select zone-c for cluster creation.

  • If requested cluster resources cannot be fully satisfied by reserved plus on-demand resources in a zone, Auto Zone will create the cluster in a zone that is most likely to satisfy the request using on-demand resources.
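
The reservation-based ordering in the example above can be reproduced with a short Python sketch. It only illustrates the arithmetic (total reserved vCPUs per zone, largest first); it is not how Dataproc implements Auto Zone, and the function and variable names are hypothetical.

```python
# vCPU counts for the two machine types used in the example above.
VCPUS_PER_MACHINE = {"n2-standard-2": 2, "n2-standard-64": 64}

# Available reservations per zone, copied from the example above.
RESERVATIONS = {
    "zone-a": {"n2-standard-2": 25},
    "zone-b": {"n2-standard-64": 1},
    "zone-c": {"n2-standard-2": 3, "n2-standard-64": 1},
}


def total_vcpus(machines):
    """Sum vCPUs over a mapping of machine type -> instance count."""
    return sum(VCPUS_PER_MACHINE[mtype] * count for mtype, count in machines.items())


def rank_zones(reservations):
    """Return (zone, reserved vCPUs) pairs, largest reservation first."""
    totals = {zone: total_vcpus(machines) for zone, machines in reservations.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    requested = total_vcpus({"n2-standard-2": 20, "n2-standard-64": 1})
    print("requested vCPUs:", requested)
    for zone, vcpus in rank_zones(RESERVATIONS):
        print(zone, vcpus)
```

The sketch ranks zones by reservations alone. Per the text above, the top-ranked zone is selected only if it also has enough on-demand capacity to cover the rest of the request.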

Use Auto Zone placement

Console

To create a Dataproc cluster that uses Auto Zone placement:

  1. In the Google Cloud console, open the Dataproc Create a Dataproc cluster on Compute Engine page. The Set up cluster panel is selected.
  2. In the Location section, do the following:
    • Select a Region for your cluster.
    • Under Zone, select "Any".

Exclude zones: Specifying zones to exclude from Auto Zone placement is not supported through the Google Cloud console. This feature is available using the Google Cloud CLI and the REST API.

gcloud CLI

To create a Dataproc cluster that uses Auto Zone placement, use the gcloud dataproc clusters create command. Set the --region flag to a region, then either omit the --zone flag or set the --zone flag to an empty string (--zone="").

As an alternative to using the --zone flag, you can use the --auto-zone-exclude-zones flag to specify a comma-separated list of zones. Auto Zone selects a zone from the specified region but excludes the listed zones from its selection criteria. Note that there must be at least two non-excluded zones available in the cluster region.

Examples:

Basic Auto Zone usage:

gcloud dataproc clusters create CLUSTER_NAME \
    --region=REGION \
    other args ...

Auto Zone with excluded zones:

gcloud dataproc clusters create CLUSTER_NAME \
    --region=REGION \
    --auto-zone-exclude-zones=ZONE_1,ZONE_2 \
    other args ...
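
If you generate these commands from a script, the two-zone rule can be checked before the command runs. The following Python sketch is one possible way to do that. The function name and the region_zones argument (a caller-supplied list of the zones in the region) are hypothetical, and the zone names in the usage example are illustrative values only. The sketch builds the argument list and does not run gcloud.

```python
import shlex


def build_create_command(cluster_name, region, region_zones, exclude_zones=()):
    """Return argv for a cluster create command that uses Auto Zone placement.

    region_zones is supplied by the caller (hypothetical input); this sketch
    does not look up which zones a region contains.
    """
    excluded = set(exclude_zones)
    remaining = [zone for zone in region_zones if zone not in excluded]
    # Mirror the documented rule: at least two non-excluded zones must remain.
    if exclude_zones and len(remaining) < 2:
        raise ValueError("at least two non-excluded zones are required")

    # --zone is omitted on purpose so that Auto Zone selects the zone.
    argv = ["gcloud", "dataproc", "clusters", "create", cluster_name,
            f"--region={region}"]
    if exclude_zones:
        argv.append("--auto-zone-exclude-zones=" + ",".join(exclude_zones))
    return argv


if __name__ == "__main__":
    zones = ["us-central1-a", "us-central1-b", "us-central1-c", "us-central1-f"]
    command = build_create_command(
        "example-cluster", "us-central1", zones,
        exclude_zones=["us-central1-a", "us-central1-f"],
    )
    print(shlex.join(command))
```

Returning a list of arguments rather than a single string keeps quoting out of the picture if the command is later passed to subprocess.run.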

REST API

To create a Dataproc cluster that uses Auto Zone placement, construct a JSON clusters.create API request, leaving the gceClusterConfig.zoneUri field empty. In the REST endpoint, https://dataproc.googleapis.com/v1/projects/projectId/regions/region/clusters, insert a region name. Dataproc Auto Zone will choose a zone for the cluster within the specified region.

To exclude specific zones, you can populate the gceClusterConfig.autoZoneExcludeZoneUris field with a list of zone names to exclude. Note that there must be at least two non-excluded zones available in the cluster region.

Use short resource names with Auto Zone placement: When specifying a resource URI, such as machineTypeUri or acceleratorTypeUri, in an Auto Zone placement REST API cluster creation request, use a short resource name without a zone specification, for example, "n1-standard-2" or "nvidia-tesla-t4".
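
As a rough illustration of the points above, the following Python sketch assembles the endpoint URL and a JSON request body without sending anything. The endpoint, gceClusterConfig.zoneUri, gceClusterConfig.autoZoneExcludeZoneUris, and machineTypeUri come from the text above. The rest of the body layout (projectId, clusterName, config, masterConfig, workerConfig, numInstances), the instance counts, and the function name are assumptions made for the example, and the empty zoneUri string is one reading of "leaving the field empty" (omitting the field is another). Authentication and the HTTP call itself are out of scope.

```python
import json


def build_create_request(project_id, region, cluster_name, exclude_zones=()):
    """Return (url, body) for an Auto Zone clusters.create request (sketch)."""
    url = (
        "https://dataproc.googleapis.com/v1/"
        f"projects/{project_id}/regions/{region}/clusters"
    )
    # Empty zoneUri: Auto Zone picks a zone within the region in the URL.
    gce_config = {"zoneUri": ""}
    if exclude_zones:
        gce_config["autoZoneExcludeZoneUris"] = list(exclude_zones)

    body = {
        "projectId": project_id,
        "clusterName": cluster_name,
        "config": {
            "gceClusterConfig": gce_config,
            # Short machine type names, with no zone in the resource name.
            "masterConfig": {"numInstances": 1, "machineTypeUri": "n1-standard-2"},
            "workerConfig": {"numInstances": 2, "machineTypeUri": "n1-standard-2"},
        },
    }
    return url, body


if __name__ == "__main__":
    url, body = build_create_request(
        "example-project", "us-central1", "example-cluster",
        exclude_zones=["us-central1-a"],
    )
    print(url)
    print(json.dumps(body, indent=2))
```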
