Set VM host maintenance policy


This document describes how to set a virtual machine (VM) instance's host maintenance policy to control how the VM behaves when a host event occurs.

Before you begin

  • If you haven't already, set up authentication. Authentication is the process by which your identity is verified for access to Google Cloud services and APIs. To run code or samples from a local development environment, you can authenticate to Compute Engine as follows.

    Select the tab for how you plan to use the samples on this page:

    Console

    When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.

    gcloud

    1. Install the Google Cloud CLI, then initialize it by running the following command:

      gcloud init
    2. Set a default region and zone.

    REST

    To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.

      Install the Google Cloud CLI, then initialize it by running the following command:

      gcloud init

    For more information, see Authenticate for using REST in the Google Cloud authentication documentation.

Limitations

  • You can't change the host maintenance policy of a preemptible VM. When there is a maintenance event, the preemptible VM stops and it does not migrate. You must manually restart the preempted VM.
  • After you create a VM using an E2 machine type, you can't change the VM's host maintenance settings from MIGRATE to TERMINATE or the other way around.

Available host maintenance properties

You can configure a VM's maintenance behavior, restart behavior, and behavior after a host error occurs with the following properties.

Compute Engine configures each VM with the default values unless you specify otherwise.

During host events, depending on the configured host maintenance policy, VMs that don't support live migration are terminated or automatically restarted.

  • onHostMaintenance: determines the behavior when a maintenance event occurs that might cause your VM to reboot.

    • MIGRATE (Default): causes Compute Engine to live migrate an instance when there is a maintenance event.
    • TERMINATE: stops a VM instead of migrating it.
  • automaticRestart: determines the behavior when a VM crashes or is stopped by the system.

    • true (Default): Compute Engine restarts an instance if the instance crashes or is stopped.
    • false: Compute Engine does not restart a VM if the VM crashes or is stopped.
  • localSsdRecoveryTimeout: sets the Local SSD recovery timeout. This is the maximum amount of time, in hours, that Compute Engine waits to recover Local SSD data after a host error. This setting only applies to VMs with attached Local SSD disks.

    • Unset (Default): Compute Engine waits up to 1 hour to recover the disk. For Z3 VMs, the default wait time is 6 hours.
    • A number from 0 to 168: specifies how long Compute Engine waits to recover the disk. The number must be an integer, in increments of 1 hour, with a maximum value of 168 hours (7 days). A value of 0 means that Compute Engine won't wait to recover the data.
  • hostErrorTimeoutSeconds (Preview): sets the maximum amount of time, in seconds, that Compute Engine waits to restart or terminate a VM after detecting that the VM is unresponsive.

    • Unset (Default): Compute Engine waits up to 5.5 minutes (330 seconds) before restarting an unresponsive VM.
    • A number from 90 to 330: specifies the number of seconds, in increments of 30, that Compute Engine waits before restarting an unresponsive VM.
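
The value ranges listed above can be checked before you send them to Compute Engine. The following is a minimal sketch in Python; the function names are hypothetical helpers written for this illustration and are not part of any Google Cloud library. It only encodes the ranges stated in this document.

```python
def is_valid_local_ssd_recovery_timeout(hours):
    """Return True if hours is a whole number of hours from 0 to 168."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(hours, bool) or not isinstance(hours, int):
        return False
    return 0 <= hours <= 168


def is_valid_host_error_timeout(seconds):
    """Return True if seconds is from 90 to 330 in increments of 30."""
    if isinstance(seconds, bool) or not isinstance(seconds, int):
        return False
    return 90 <= seconds <= 330 and seconds % 30 == 0


if __name__ == "__main__":
    print(is_valid_local_ssd_recovery_timeout(168))  # upper bound, 7 days
    print(is_valid_host_error_timeout(100))          # not a multiple of 30
```

Checking values locally like this can catch a typo before a request is rejected, but Compute Engine remains the authority on which values it accepts.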

Set host maintenance policy of a VM

You can set the host maintenance policy of a VM when you first create the VM or after the VM is created.

Set host maintenance policy during VM creation

The information in this section focuses on how to set the host maintenance policy when you create a VM. For more VM creation examples, see Create and start a VM instance.

You can set the host maintenance policy of a VM at creation using the Google Cloud console, the gcloud CLI, or the Compute Engine API.

Console

  1. In the Google Cloud console, go to the Create an instance page.

    Go to Create an instance

  2. Specify a Name for the VM.

  3. Select a Region and Zone for the VM.

  4. In the Machine configuration section, do the following:

    1. Specify the details of the machine type for the VM.
    2. Expand the VM provisioning model advanced settings menu.
    3. In the On host maintenance menu, select one of the following options:
      • To migrate VMs during maintenance events, select Migrate VM instance.
      • To stop VMs during maintenance events, select Terminate VM instance.
  5. To create the VM, click Create.

gcloud

In the Google Cloud console, activate Cloud Shell.

At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

To set the host maintenance policy of a new VM, use the gcloud compute instances create command. Include one or more of the following parameters:

  • --maintenance-policy: whether the VM is migrated or stopped during host maintenance. The VM is migrated by default if you omit this flag.
  • --no-restart-on-failure or --restart-on-failure: whether the VM restarts automatically after a host error. By default, the VM restarts when a failure is detected.
  • --local-ssd-recovery-timeout: how much time Compute Engine spends recovering any attached Local SSD disks after a host error. The default is 1 hour.

Set the host maintenance policy of a new VM with the following command. If you omit any of the flags, the flag's default is used.

    gcloud compute instances create VM_NAME \
        --maintenance-policy=MAINTENANCE_POLICY \
        RESTART_ON_FAILURE_BEHAVIOR \
        --local-ssd-recovery-timeout=SSD_RECOVERY_TIMEOUT

Replace the following:

  • VM_NAME: the VM name.
  • MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
  • RESTART_ON_FAILURE_BEHAVIOR: the restart behavior for the VM, either --no-restart-on-failure or --restart-on-failure.
  • SSD_RECOVERY_TIMEOUT: the number of hours to spend recovering a Local SSD attached to an unresponsive VM. Valid values are from 0 to 168, in increments of 1 hour.
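
If you generate these commands from a script, it can help to assemble the argument list in one place. The following Python sketch builds the command shown above from the placeholder values; `build_create_command` and the VM name `example-vm` are hypothetical names used only for illustration. The sketch does not run gcloud, it only prints the command line.

```python
import shlex


def build_create_command(vm_name, maintenance_policy="MIGRATE",
                         restart_on_failure=True, ssd_recovery_timeout=None):
    """Return the gcloud argument list for creating a VM with a host maintenance policy."""
    if maintenance_policy not in ("MIGRATE", "TERMINATE"):
        raise ValueError("maintenance_policy must be MIGRATE or TERMINATE")
    args = [
        "gcloud", "compute", "instances", "create", vm_name,
        f"--maintenance-policy={maintenance_policy}",
        "--restart-on-failure" if restart_on_failure else "--no-restart-on-failure",
    ]
    # Omitting the flag leaves the default recovery timeout in place.
    if ssd_recovery_timeout is not None:
        if not 0 <= ssd_recovery_timeout <= 168:
            raise ValueError("ssd_recovery_timeout must be from 0 to 168 hours")
        args.append(f"--local-ssd-recovery-timeout={ssd_recovery_timeout}")
    return args


if __name__ == "__main__":
    # shlex.join quotes arguments so the printed line is safe to paste into a shell.
    print(shlex.join(build_create_command("example-vm", "TERMINATE", False, 2)))
```

Returning a list rather than a single string keeps each argument separate, which is the form `subprocess.run` expects if you later decide to execute the command from Python.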

Set the host error detection timeout

Preview — --host-error-timeout-seconds

This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

To set the maximum amount of time Compute Engine waits to restart or terminate an unresponsive VM, use the gcloud beta compute instances create command, because this feature is in Preview. Specify the timeout with the --host-error-timeout-seconds flag.

    gcloud beta compute instances create VM_NAME \
        --maintenance-policy=MAINTENANCE_POLICY \
        RESTART_ON_FAILURE_BEHAVIOR \
        --local-ssd-recovery-timeout=SSD_RECOVERY_TIMEOUT \
        --host-error-timeout-seconds=ERROR_DETECTION_TIMEOUT

Replace the following:

  • VM_NAME: the VM name.
  • MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
  • RESTART_ON_FAILURE_BEHAVIOR: the restart behavior for the VM, either --no-restart-on-failure or --restart-on-failure.
  • SSD_RECOVERY_TIMEOUT: the number of hours Compute Engine spends recovering a Local SSD that was attached to an unresponsive VM. Valid values are from 0 to 168, in increments of 1 hour.
  • ERROR_DETECTION_TIMEOUT: the number of seconds Compute Engine waits before restarting or terminating an unresponsive VM, from 90 to 330, in increments of 30.

REST

To set the host maintenance policy of a new VM using the Compute Engine API, use the instances.insert method. Include one or more of the following properties in the scheduling object of the request body:

  • onHostMaintenance: whether the VM is migrated or stopped during host maintenance. The VM is migrated by default.
  • automaticRestart: whether the VM restarts automatically after a host error. VMs are restarted automatically by default.
  • localSsdRecoveryTimeout: how much time Compute Engine spends recovering any attached Local SSD disks after detecting a host error. The default is 1 hour.

    POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances

    {
      "name": "VM_NAME",
      "scheduling": {
        "onHostMaintenance": "MAINTENANCE_POLICY",
        "automaticRestart": RESTART_POLICY,
        "localSsdRecoveryTimeout": SSD_RECOVERY_TIMEOUT
      }
    }

Replace the following:

  • PROJECT_ID: the project for the VM.
  • ZONE: the zone where you want to create the VM.
  • VM_NAME: the VM name.
  • MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
  • RESTART_POLICY: the restart policy for this VM, either true or false.
  • SSD_RECOVERY_TIMEOUT: the number of hours Compute Engine spends recovering a Local SSD disk that was attached to an unresponsive VM. Valid values are from 0 to 168, in increments of 1 hour.

Set the host error detection timeout

Preview — hostErrorTimeoutSeconds

To set the maximum amount of time Compute Engine waits to restart or terminate an unresponsive VM, use the beta instances.insert method, because this option is only available in Preview.

Add the hostErrorTimeoutSeconds property to the scheduling object of the request body.

    POST https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/instances

    {
      "name": "VM_NAME",
      "scheduling": {
        "onHostMaintenance": "MAINTENANCE_POLICY",
        "automaticRestart": RESTART_POLICY,
        "localSsdRecoveryTimeout": SSD_RECOVERY_TIMEOUT,
        "hostErrorTimeoutSeconds": HOST_ERROR_TIMEOUT
      }
    }

Replace the following:

  • PROJECT_ID: the project for the VM.
  • ZONE: the zone where you want to create the VM.
  • VM_NAME: the VM name.
  • MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
  • RESTART_POLICY: the restart policy for this VM, either true or false.
  • SSD_RECOVERY_TIMEOUT: the number of hours Compute Engine spends recovering a Local SSD disk that was attached to an unresponsive VM. Valid values are from 0 to 168, in increments of 1 hour.
  • HOST_ERROR_TIMEOUT: the number of seconds Compute Engine waits before restarting or terminating an unresponsive VM. Valid values are from 90 to 330, in increments of 30.

Update the host maintenance policy of an existing VM

Console

  1. In the Google Cloud console, go to the VM instances page.

    Go to VM instances

  2. Click the VM for which you want to change settings. The VM details page displays.

  3. On the VM details page, complete the following steps:

    1. Click the Edit button at the top of the page.
    2. Go to the Management section. In the Availability policies section, set the On host maintenance and Automatic restart options.
    3. Click Save.

gcloud

In the Google Cloud console, activate Cloud Shell.

At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

Update the host maintenance policy of an existing VM with the gcloud compute instances set-scheduling command. Use the same parameters described in the VM creation command in the preceding section.

    gcloud compute instances set-scheduling VM_NAME \
        --maintenance-policy=MAINTENANCE_POLICY \
        RESTART_ON_FAILURE_BEHAVIOR \
        --local-ssd-recovery-timeout=SSD_RECOVERY_TIMEOUT

Replace the following:

  • VM_NAME: the VM name.
  • MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
  • RESTART_ON_FAILURE_BEHAVIOR: the restart behavior for the VM, either --no-restart-on-failure or --restart-on-failure.
  • SSD_RECOVERY_TIMEOUT: the time, in hours, Compute Engine spends recovering a Local SSD disk attached to an unresponsive VM. Valid values are from 0 to 168.

Update the host error detection timeout

Preview — --host-error-timeout-seconds

To update the maximum amount of time Compute Engine waits to restart or terminate an unresponsive VM, use the gcloud beta compute instances set-scheduling command, because this feature is only available in Preview.

Update the timeout with the --host-error-timeout-seconds parameter. For example:

    gcloud beta compute instances set-scheduling VM_NAME \
        --maintenance-policy=MAINTENANCE_POLICY \
        RESTART_ON_FAILURE_BEHAVIOR \
        --local-ssd-recovery-timeout=SSD_RECOVERY_TIMEOUT \
        --host-error-timeout-seconds=NUMBER_OF_SECONDS

Replace the following:

  • VM_NAME: the VM name.
  • MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
  • RESTART_ON_FAILURE_BEHAVIOR: the restart behavior for the VM, either --no-restart-on-failure or --restart-on-failure.
  • SSD_RECOVERY_TIMEOUT: the time, in hours, Compute Engine spends recovering a Local SSD disk that was attached to an unresponsive VM. Valid values are from 0 to 168.
  • NUMBER_OF_SECONDS: the number of seconds Compute Engine waits before restarting or terminating an unresponsive VM, from 90 to 330, in increments of 30.

REST

Update the host maintenance policy of an existing VM with a POST request to the instances.setScheduling method.

Include one or more of the following properties in the request body:

  • onHostMaintenance: whether the VM is migrated or stopped during host maintenance. The VM is migrated by default.
  • automaticRestart: whether the VM restarts automatically after a host error. VMs are restarted automatically by default.
  • localSsdRecoveryTimeout: how much time Compute Engine spends recovering any attached Local SSD disks after detecting a host error. If omitted, the default is 1 hour.

    POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME/setScheduling

    {
      "onHostMaintenance": "MAINTENANCE_POLICY",
      "automaticRestart": RESTART_POLICY,
      "localSsdRecoveryTimeout": SSD_RECOVERY_TIMEOUT
    }

Replace the following:

  • PROJECT_ID: the project for the VM.
  • ZONE: the zone where the VM is located.
  • VM_NAME: the VM name.
  • MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
  • RESTART_POLICY: the restart policy for this VM, either true or false.
  • SSD_RECOVERY_TIMEOUT: the time, in hours, that Compute Engine spends recovering a Local SSD disk that was attached to an unresponsive VM. Valid values are from 0 to 168.

Update the host error detection timeout

Preview — hostErrorTimeoutSeconds

To update the maximum amount of time Compute Engine waits to restart or terminate an unresponsive VM, use the beta instances.setScheduling method, because this feature is only available in Preview.

Add the hostErrorTimeoutSeconds parameter to the request body.

    POST https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME/setScheduling

    {
      "hostErrorTimeoutSeconds": NUMBER_OF_SECONDS
    }

Replace the following:

  • PROJECT_ID: the project for the VM.
  • ZONE: the zone where the VM is located.
  • VM_NAME: the VM name.
  • NUMBER_OF_SECONDS: the number of seconds Compute Engine waits before restarting or terminating an unresponsive VM, from 90 to 330, in increments of 30.
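
The two setScheduling requests above differ only in the API version segment of the URL and in the properties they send. As a rough illustration, the following Python sketch assembles the URL and JSON body for either case. `build_set_scheduling_request` and the project, zone, and VM names are hypothetical; the sketch does not send the request or handle authentication, and it leaves out localSsdRecoveryTimeout for brevity.

```python
import json

BASE_URL = "https://compute.googleapis.com/compute"


def build_set_scheduling_request(project_id, zone, vm_name, *,
                                 on_host_maintenance=None,
                                 automatic_restart=None,
                                 host_error_timeout_seconds=None):
    """Return (url, json_body) for an instances.setScheduling request."""
    body = {}
    if on_host_maintenance is not None:
        body["onHostMaintenance"] = on_host_maintenance
    if automatic_restart is not None:
        # Serialized as a JSON boolean (true/false), not a quoted string.
        body["automaticRestart"] = bool(automatic_restart)
    api_version = "v1"
    if host_error_timeout_seconds is not None:
        # hostErrorTimeoutSeconds is in Preview, so use the beta endpoint.
        body["hostErrorTimeoutSeconds"] = host_error_timeout_seconds
        api_version = "beta"
    url = (f"{BASE_URL}/{api_version}/projects/{project_id}"
           f"/zones/{zone}/instances/{vm_name}/setScheduling")
    return url, json.dumps(body)


if __name__ == "__main__":
    url, payload = build_set_scheduling_request(
        "example-project", "us-central1-a", "example-vm",
        host_error_timeout_seconds=120)
    print("POST", url)
    print(payload)
```

Building the body with `json.dumps` avoids two easy mistakes in hand-written JSON: quoting the boolean value and leaving a trailing comma after the last property.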

View host maintenance policy settings of a VM

Console

  1. Go to the VM instances page.

    Go to VM instances

  2. Click the Name of the VM for which you want to view settings. The VM instance details page opens.

  3. Go to the Management section. The Availability policies subsection shows your current settings for On host maintenance and Automatic restart.

gcloud

View the host maintenance settings for a VM with the gcloud compute instances describe command:

    gcloud compute instances describe VM_NAME --format="yaml(scheduling)"

Replace VM_NAME with the VM name.

The output includes the VM's host maintenance settings, for example:

 scheduling:
      automaticRestart: true
      localSsdRecoveryTimeout:
        nanos: 0
        seconds: '10800'
      onHostMaintenance: MIGRATE
      preemptible: false
      provisioningModel: STANDARD 

View the host error detection timeout setting

Preview — hostErrorTimeoutSeconds

View the current value of hostErrorTimeoutSeconds with the gcloud beta compute instances describe command, because this option is only available in Preview.

    gcloud beta compute instances describe VM_NAME --format="yaml(scheduling)"

Replace VM_NAME with the VM name.

The output includes the VM's host error detection timeout, for example:

 scheduling:
    automaticRestart: true
    hostErrorTimeoutSeconds: 120
    localSsdRecoveryTimeout:
      nanos: 0
      seconds: '10800'
    onHostMaintenance: MIGRATE
    preemptible: false
    provisioningModel: STANDARD 

REST

To view the host maintenance settings for a VM, use the instances.get method:

    GET https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME

Replace the following:

  • PROJECT_ID: the project where the VM is located.
  • ZONE: the zone where the VM is located.
  • VM_NAME: the VM name.

In the output, the scheduling object contains the VM's host maintenance policy, for example:

 "scheduling": {
      "onHostMaintenance": "MIGRATE",
      "automaticRestart": true,
      "preemptible": false,
      "provisioningModel": "STANDARD",
      "localSsdRecoveryTimeout": {
        "seconds": "10800",
        "nanos": 0
      }
    } 
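
In this example output, localSsdRecoveryTimeout is reported as a duration in seconds, with the value given as a string, whereas the setting is described in hours elsewhere in this document. The following Python sketch reads the scheduling object and converts the timeout back to hours. `summarize_scheduling` is a hypothetical helper, and the sample data is copied from the output above.

```python
import json

# Response fragment copied from the example output above.
SAMPLE_RESPONSE = """
{
  "scheduling": {
    "onHostMaintenance": "MIGRATE",
    "automaticRestart": true,
    "preemptible": false,
    "provisioningModel": "STANDARD",
    "localSsdRecoveryTimeout": {"seconds": "10800", "nanos": 0}
  }
}
"""


def summarize_scheduling(instance):
    """Extract the host maintenance settings from a parsed instance resource."""
    scheduling = instance.get("scheduling", {})
    summary = {
        "on_host_maintenance": scheduling.get("onHostMaintenance"),
        "automatic_restart": scheduling.get("automaticRestart"),
        "local_ssd_recovery_hours": None,
    }
    timeout = scheduling.get("localSsdRecoveryTimeout")
    if timeout is not None:
        # "seconds" is a string in the output, so convert it before dividing.
        summary["local_ssd_recovery_hours"] = int(timeout["seconds"]) / 3600
    return summary


if __name__ == "__main__":
    print(summarize_scheduling(json.loads(SAMPLE_RESPONSE)))
```

For the sample above, 10800 seconds corresponds to a recovery timeout of 3 hours.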

View the host error timeout settings

Preview — hostErrorTimeoutSeconds

View the current hostErrorTimeoutSeconds setting with a GET request to the beta instances.get method, because this option is only available in Preview.

    GET https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME

Replace the following:

  • PROJECT_ID: the project for the VM.
  • ZONE: the zone where the VM is located.
  • VM_NAME: the VM name.

In the output, the scheduling object includes the VM's host error detection timeout, for example:

 "scheduling": {
    "hostErrorTimeoutSeconds": 120
  } 
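
When a script reads this value, it may need a fallback for VMs where the timeout was never set. The following Python sketch returns the configured value, or the 330-second default described earlier in this document. It assumes that an unset timeout appears as a missing hostErrorTimeoutSeconds key in the response, which this document does not confirm; `effective_host_error_timeout` is a hypothetical helper.

```python
# Default wait described in this document: 5.5 minutes.
DEFAULT_HOST_ERROR_TIMEOUT_SECONDS = 330


def effective_host_error_timeout(instance):
    """Return the host error timeout in seconds for a parsed instance resource."""
    scheduling = instance.get("scheduling", {})
    # Assumption: a missing key means the timeout is unset.
    return scheduling.get("hostErrorTimeoutSeconds",
                          DEFAULT_HOST_ERROR_TIMEOUT_SECONDS)


if __name__ == "__main__":
    print(effective_host_error_timeout({"scheduling": {"hostErrorTimeoutSeconds": 120}}))
    print(effective_host_error_timeout({"scheduling": {}}))
```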

What's next