To supplement the boot disk, you can attach local Solid State Drives (local SSDs) to master, primary worker, and secondary worker nodes in your cluster. When local SSDs are provided to the cluster, both HDFS and scratch data, such as shuffle outputs, use the local SSDs instead of the boot persistent disk.
- Local SSDs can provide faster read and write times than persistent disk (see Local SSD Performance).
- The 375 GB size of each local SSD is fixed, but you can attach multiple local SSDs to increase SSD storage (see About Local SSDs).
- Each local SSD is mounted to /mnt/<id> in Dataproc cluster nodes.
- Local SSDs use ext4 as the default filesystem.
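Because each local SSD has a fixed size, the total local SSD capacity on a node is a multiple of the number of attached disks. The following is a minimal shell sketch, assuming the 375 GB per-disk size stated above. The function name local_ssd_capacity_gb is hypothetical, and the commented df command is only one way you might inspect the mounts on a cluster node.

```shell
#!/usr/bin/env bash
# Hypothetical helper: total local SSD capacity (in GB) for N attached disks,
# assuming the fixed 375 GB per-disk size stated above.
local_ssd_capacity_gb() {
  local count="$1"
  echo $(( count * 375 ))
}

local_ssd_capacity_gb 4   # prints 1500

# On a cluster node, listing mounted filesystems with their types should show
# the /mnt/<id> mount points (run manually; output depends on the node):
#   df -hT | grep /mnt
```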
Use local SSDs
gcloud command
Use the gcloud dataproc clusters create command with the --num-master-local-ssds, --num-worker-local-ssds, and --num-secondary-worker-local-ssds flags to attach local SSDs to the cluster's master, primary worker, and secondary worker nodes.
Local SSDs can be attached to Dataproc VMs using a SCSI (Small Computer System Interface) or NVME (Non-Volatile Memory Express) interface (see local SSD performance). The default Dataproc cluster VM local SSD interface is the SCSI interface. Use the gcloud dataproc clusters create command with the --master-local-ssd-interface, --worker-local-ssd-interface, and --secondary-worker-local-ssd-interface flags to specify the local SSD interface for master, primary worker, and secondary worker nodes.
Example:
gcloud dataproc clusters create cluster-name \
    --region=region \
    --num-master-local-ssds=1 \
    --num-worker-local-ssds=1 \
    --num-secondary-worker-local-ssds=1 \
    --master-local-ssd-interface=NVME \
    --worker-local-ssd-interface=NVME \
    --secondary-worker-local-ssd-interface=NVME \
    ... other args ...
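When every node role should get the same number of local SSDs and the same interface, the six flags can be generated rather than typed out. This is only a sketch: local_ssd_flags is a hypothetical helper, not part of gcloud, and it assumes one count and one interface shared by all three roles.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the local SSD flags for all three node roles,
# one flag per line, using the same count and interface for each role.
local_ssd_flags() {
  local count="$1" interface="${2:-SCSI}"   # SCSI is the documented default
  local role
  for role in master worker secondary-worker; do
    echo "--num-${role}-local-ssds=${count}"
    echo "--${role}-local-ssd-interface=${interface}"
  done
}

# Print the flags for one NVME local SSD per node.
local_ssd_flags 1 NVME

# Possible usage (cluster-name and region are placeholders; requires bash 4+):
#   mapfile -t ssd_flags < <(local_ssd_flags 1 NVME)
#   gcloud dataproc clusters create cluster-name --region=region "${ssd_flags[@]}"
```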
REST API
Set the numLocalSsds field in the masterConfig, workerConfig, and secondaryWorkerConfig InstanceGroupConfig in a clusters.create API request to attach local SSDs to the cluster's master, primary worker, and secondary worker nodes.
Local SSDs can be attached to Dataproc VMs using a SCSI (Small Computer System Interface) or NVME (Non-Volatile Memory Express) interface (see local SSD performance). The default Dataproc cluster VM local SSD interface is the SCSI interface. Set the localSsdInterface field in the masterConfig, workerConfig, and secondaryWorkerConfig InstanceGroupConfig in a clusters.create API request to "SCSI" or "NVME" to specify the interface used to attach local SSDs to the cluster's master, primary worker, and secondary worker nodes.
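A request body that sets both fields might look like the following sketch. It assumes that numLocalSsds and localSsdInterface are nested under each instance group's diskConfig object, so check the InstanceGroupConfig reference before relying on that layout. The helper name cluster_create_body is hypothetical, and PROJECT_ID, REGION, and cluster-name are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a cluster create request body that attaches
# local SSDs to the master, primary worker, and secondary worker groups.
# Assumption: numLocalSsds and localSsdInterface are nested under diskConfig.
cluster_create_body() {
  local name="$1" count="$2" interface="${3:-SCSI}"
  local disk="{\"numLocalSsds\": ${count}, \"localSsdInterface\": \"${interface}\"}"
  cat <<EOF
{
  "clusterName": "${name}",
  "config": {
    "masterConfig": {"diskConfig": ${disk}},
    "workerConfig": {"diskConfig": ${disk}},
    "secondaryWorkerConfig": {"diskConfig": ${disk}}
  }
}
EOF
}

# Print the body for one NVME local SSD per node.
cluster_create_body cluster-name 1 NVME

# Sending the request (placeholders throughout; not run here):
#   curl -X POST \
#     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#     -H "Content-Type: application/json" \
#     -d "$(cluster_create_body cluster-name 1 NVME)" \
#     "https://dataproc.googleapis.com/v1/projects/PROJECT_ID/regions/REGION/clusters"
```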
Console
Create a cluster and attach local SSDs to the master, primary, and secondary worker nodes from the Configure nodes panel of the Dataproc Create a cluster page of the Google Cloud console.

