This document describes the provisioning models for Compute Engine instances. To learn more about deployment options, see Choose a Compute Engine deployment strategy for your workload.
Provisioning models determine the availability, lifespan, and pricing of your instances. If you understand these models, then you can choose the best option for your workload.
Available provisioning models
When you create a compute instance, you can specify one of the following provisioning models. If you don't specify a provisioning model, then Compute Engine uses the standard provisioning model by default.
- Standard
- Spot
- Flex-start (Preview)
- Reservation-bound
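As a minimal sketch of how a provisioning model might be selected in an instance request, the following Python helper builds a `scheduling` fragment. The enum strings and the `provisioningModel` field name are assumptions based on the Compute Engine API's scheduling settings, and the helper `scheduling_for` is hypothetical; verify both against the current API reference before relying on them.

```python
# Assumed enum strings for the scheduling.provisioningModel field.
# These are illustrative; confirm them in the Compute Engine API reference.
PROVISIONING_MODELS = {
    "standard": "STANDARD",
    "spot": "SPOT",
    "flex-start": "FLEX_START",
    "reservation-bound": "RESERVATION_BOUND",
}


def scheduling_for(model=None):
    """Return a hypothetical `scheduling` fragment for an instance request body."""
    if model is None:
        # No model specified: leave the field out, so that Compute Engine
        # applies the standard provisioning model by default.
        return {}
    key = model.strip().lower()
    if key not in PROVISIONING_MODELS:
        raise ValueError(f"unknown provisioning model: {model!r}")
    return {"provisioningModel": PROVISIONING_MODELS[key]}
```

For example, `scheduling_for("Spot")` yields a fragment that requests the spot model, while `scheduling_for()` yields an empty fragment and leaves the choice to the default.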
The following table helps you compare the use cases and pricing for each provisioning model:
Standard:
- Based on resource availability, you can immediately create instances.
- You can control when to stop or delete instances.

Spot:
- Based on resource availability, you can immediately create instances.
- You can control when to stop or delete instances. However, you also allow Compute Engine to stop or delete instances at any time to reclaim capacity.

Flex-start:
- After you create a zonal managed instance group (MIG), you request Compute Engine to add instances with GPUs attached to the MIG. Compute Engine schedules the provisioning of the instances based on resource availability.
- You can control when to delete instances. However, you can't stop, suspend, or recreate them. The instances run for up to seven days. Then, Compute Engine deletes them.

Reservation-bound:
- You can request to reserve capacity at a future date for creating instances with GPUs attached. If Google Cloud approves your request, then Compute Engine creates a reservation. At the start of the reservation period, you can consume the reservation by creating GPU instances that match the reservation.
- During the approved reservation period, you can stop, restart, delete, and recreate instances to consume the reservation as needed. When the reservation period ends, Compute Engine deletes the reservation, and stops or deletes any instances that consume the reservation.
Standard use cases:
- Web servers
- Databases
- Enterprise applications
- Development and testing

Spot use cases:
- Batch processing
- High performance computing (HPC)
- Continuous integration and continuous deployment (CI/CD)
- Data analytics
- Media encoding
- Online inference

Flex-start use cases:
- Small model pre-training
- Model fine-tuning
- HPC simulation
- Batch inference
Reservation-bound use cases:
- For workloads that last up to 90 days:
  - Model pre-training jobs
  - Model fine-tuning jobs
  - HPC simulation workloads
  - Short-term expected increases in inference workloads
- For workloads longer than 90 days:
  - Training workloads
  - Inference workloads
Reservation-bound pricing:
- If you reserve capacity in AI Hypercomputer, then you incur charges based on accelerator-optimized VMs pricing.
- If you reserve capacity by using future reservations in calendar mode, then you incur charges based on the Dynamic Workload Scheduler (DWS) pricing.
Instance availability and lifespan
The following table shows the availability and lifespan of compute instances for each provisioning model:
To create instances, you must first reserve capacity using one of the following methods:
- To reserve capacity for long-running workloads, use future reservations in AI Hypercomputer.
- To reserve capacity for workloads that run for up to 90 days, use future reservations in calendar mode.
- M2 and M3
- G4
- Bare metal instances
- If you reserve capacity in AI Hypercomputer, then you can use only the A4X, A4, and A3 Ultra machine series.
- If you create a future reservation in calendar mode, then you can use only the A4 and A3 Ultra machine series.
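The reservation guidance above (calendar mode for workloads of up to 90 days, AI Hypercomputer for long-running workloads, each with its own machine-series limits) can be sketched as a small lookup. This is an illustrative simplification, not an official selection rule: the function `pick_reservation_method`, the method keys, and the hard 90-day split are assumptions made for the example.

```python
# Machine series listed in this document for each reservation method.
# The dictionary keys are hypothetical labels, not API identifiers.
SUPPORTED_SERIES = {
    "ai-hypercomputer": {"A4X", "A4", "A3 Ultra"},
    "calendar-mode": {"A4", "A3 Ultra"},
}

# Calendar mode is described for workloads that run for up to 90 days.
CALENDAR_MODE_MAX_DAYS = 90


def pick_reservation_method(workload_days, machine_series):
    """Suggest a reservation method from workload length and machine series."""
    if workload_days <= 0:
        raise ValueError("workload_days must be positive")
    # Assumption: treat 90 days as a hard boundary between the two methods.
    if workload_days <= CALENDAR_MODE_MAX_DAYS:
        method = "calendar-mode"
    else:
        method = "ai-hypercomputer"
    if machine_series not in SUPPORTED_SERIES[method]:
        raise ValueError(f"{machine_series} is not listed for {method}")
    return method
```

Under these assumptions, a 30-day job on A4 maps to calendar mode, and a 120-day job on A4X maps to AI Hypercomputer.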
- If the machine type that the instance uses doesn't support live migration, then Compute Engine stops your instances during host maintenance events.
- In rare cases, the instance may stop due to a host error.
- Compute Engine might stop or delete the instance at any time to reclaim capacity. This process is called preemption.
- If the machine type that the instance uses doesn't support live migration, then Compute Engine stops your instances during host maintenance events.
- In rare cases, the instance may stop due to a host error.
Compute Engine deletes instances when one of the following happens:
- You request to delete instances.
- The instances reach the end of their run duration.
- Compute Engine stops your instance during host maintenance events.
- The automatically created reservation to provision your requested capacity reaches the end of its committed reservation period. At that time, Compute Engine deletes the reservation, and stops or deletes any instances that consume the reservation.
- In rare cases, the instance may stop due to a host error.
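Because flex-start instances run for up to seven days before Compute Engine deletes them, it can help to estimate the latest time at which an instance will still exist. The following is a rough sketch under that seven-day limit; the function `latest_deletion_time` and its parameters are hypothetical, and it does not account for earlier deletion that you request yourself.

```python
from datetime import datetime, timedelta, timezone

# Upper bound on the run duration of a flex-start instance, per this document.
FLEX_START_MAX_RUN = timedelta(days=7)


def latest_deletion_time(started_at, requested_run=None):
    """Estimate when a flex-start instance reaches the end of its run duration.

    `requested_run` is a hypothetical shorter run duration; when omitted,
    the seven-day maximum is used.
    """
    run = FLEX_START_MAX_RUN if requested_run is None else requested_run
    if run <= timedelta(0) or run > FLEX_START_MAX_RUN:
        raise ValueError("run duration must be positive and at most seven days")
    return started_at + run


start = datetime(2025, 1, 1, tzinfo=timezone.utc)
deadline = latest_deletion_time(start)
```

A job scheduler could compare `deadline` with the expected job length and checkpoint or shorten the work accordingly.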
What's next
- To create instances by using the spot provisioning model, see Spot VMs.
- To create instances by using the flex-start provisioning model, see About resize requests in a MIG.
- To reserve capacity to create instances by using the reservation-bound model, see one of the following options: