Deployments and endpoints

Google and Partner models and generative AI features on Vertex AI are exposed as specific regional endpoints and a global endpoint. Global endpoints cover the entire world and provide higher availability and reliability than single regions.

Note that model endpoints don't guarantee region availability or in-region ML processing. For information about data residency, see Data residency.

Global endpoint

Selecting the global endpoint for your requests can improve overall availability while reducing resource exhausted (429) errors. Don't use the global endpoint if you have requirements about where ML processing takes place, because you can't control or know which region handles a given request.
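The guidance above can be sketched as a small decision helper. This is a minimal illustration, not part of any SDK: the function name `choose_location` and its parameters are hypothetical. It only encodes the rule that requests with ML processing requirements should not go to the global endpoint.

```python
from typing import Optional

GLOBAL_LOCATION = "global"


def choose_location(has_ml_processing_requirements: bool,
                    region: Optional[str] = None) -> str:
    """Pick a location string for a request (hypothetical helper).

    Choosing a regional endpoint is not by itself a data residency
    guarantee; this only avoids the global endpoint when it is unsuitable.
    """
    if has_ml_processing_requirements:
        if not region:
            raise ValueError("A region is required when the global endpoint can't be used.")
        return region
    # No processing-location constraint: prefer the global endpoint.
    return GLOBAL_LOCATION


print(choose_location(False))                  # global
print(choose_location(True, "europe-west4"))   # europe-west4
```

The returned string can be passed as the `location` argument when creating a client or initializing the SDK.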

Supported models

Usage of the global endpoint is supported for the following Google models in specified regions. For details about which regions support the global endpoint, see the Global tab in the Google model endpoint locations table.

For information about global endpoint availability for partner models, see the Global tab in the Google Cloud partner model endpoint locations table.

Use the global endpoint

To use the global endpoint, exclude the location from the endpoint name and set the location of the resource to global. For example, the following is a global endpoint URL:

    https://aiplatform.googleapis.com/v1/projects/test-project/locations/global/publishers/google/models/gemini-2.0-flash-001:generateContent
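As an illustration of how the URL changes between a regional and the global endpoint, here is a minimal sketch. The helper name `build_generate_content_url` is hypothetical, and the regional host pattern (`LOCATION-aiplatform.googleapis.com`) is an assumption that this section doesn't state.

```python
def build_generate_content_url(project: str, location: str, model: str) -> str:
    """Build a generateContent REST URL for a Google publisher model (illustrative)."""
    # Assumption: regional endpoints prefix the host with the location,
    # while the global endpoint uses the bare host name.
    if location == "global":
        host = "aiplatform.googleapis.com"
    else:
        host = f"{location}-aiplatform.googleapis.com"
    return (
        f"https://{host}/v1/projects/{project}/locations/{location}"
        f"/publishers/google/models/{model}:generateContent"
    )


print(build_generate_content_url("test-project", "global", "gemini-2.0-flash-001"))
print(build_generate_content_url("test-project", "us-central1", "gemini-2.0-flash-001"))
```

The first call reproduces the example URL shown above. The second shows the assumed regional form for comparison.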

For the Google Gen AI SDK, create a client that uses the global location:

    from google import genai

    client = genai.Client(vertexai=True, project='PROJECT_ID', location='global')

For the Vertex AI SDK for Python, initialize the SDK using the global location:

    import vertexai
    from vertexai.generative_models import GenerativeModel

    vertexai.init(project='PROJECT_ID', location='global')

Limitations

The following capabilities are not available when using the global endpoint:

  • Tuning
  • Batch prediction for Anthropic and OpenMaaS models
  • Retrieval-augmented generation (RAG) corpus (RAG requests are supported)
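The list above could be encoded as a pre-flight check before routing work to the global endpoint. This is only a sketch: the capability and model-family labels (`"tuning"`, `"rag_corpus"`, `"batch_prediction"`, `"anthropic"`, `"open_maas"`) are hypothetical names chosen for illustration, not API values.

```python
# Hypothetical labels for the limitations listed above.
ALWAYS_UNAVAILABLE = {"tuning", "rag_corpus"}
BATCH_UNAVAILABLE_FAMILIES = {"anthropic", "open_maas"}


def is_listed_limitation(capability: str, model_family: str = "google") -> bool:
    """Return True if the capability appears in the limitations list.

    False means only "not listed as a limitation here"; it does not
    confirm that the capability is available.
    """
    if capability in ALWAYS_UNAVAILABLE:
        return True
    if capability == "batch_prediction" and model_family in BATCH_UNAVAILABLE_FAMILIES:
        return True
    return False


print(is_listed_limitation("tuning"))                         # True
print(is_listed_limitation("batch_prediction", "anthropic"))  # True
print(is_listed_limitation("rag_request"))                    # False
```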

Usage of the global endpoint with Provisioned Throughput is available only for the following models:

Model                              Latest supported model version
Gemini 2.5 Flash (preview)         gemini-2.5-flash-preview-09-2025
Gemini 2.5 Flash-Lite (preview)    gemini-2.5-flash-lite-preview-09-2025
Gemini 2.5 Flash Image             gemini-2.5-flash-image
Gemini 2.5 Flash-Lite              gemini-2.5-flash-lite
Gemini 2.5 Pro                     gemini-2.5-pro
Gemini 2.5 Flash                   gemini-2.5-flash
Gemini 2.0 Flash                   gemini-2.0-flash-001
Gemini 2.0 Flash-Lite              gemini-2.0-flash-lite-001
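A lookup like the following could guard against sending Provisioned Throughput traffic for an unsupported model to the global endpoint. It is a sketch that copies the model versions from the table above as a static snapshot. The names `GLOBAL_PT_MODELS` and `supports_global_provisioned_throughput` are hypothetical, and the list needs updating whenever the table changes.

```python
# Snapshot of the table above: display name -> latest supported model version.
GLOBAL_PT_MODELS = {
    "Gemini 2.5 Flash (preview)": "gemini-2.5-flash-preview-09-2025",
    "Gemini 2.5 Flash-Lite (preview)": "gemini-2.5-flash-lite-preview-09-2025",
    "Gemini 2.5 Flash Image": "gemini-2.5-flash-image",
    "Gemini 2.5 Flash-Lite": "gemini-2.5-flash-lite",
    "Gemini 2.5 Pro": "gemini-2.5-pro",
    "Gemini 2.5 Flash": "gemini-2.5-flash",
    "Gemini 2.0 Flash": "gemini-2.0-flash-001",
    "Gemini 2.0 Flash-Lite": "gemini-2.0-flash-lite-001",
}
GLOBAL_PT_MODEL_IDS = frozenset(GLOBAL_PT_MODELS.values())


def supports_global_provisioned_throughput(model_id: str) -> bool:
    """Check a model version against the snapshot of the table (hypothetical helper)."""
    return model_id in GLOBAL_PT_MODEL_IDS


print(supports_global_provisioned_throughput("gemini-2.5-pro"))        # True
print(supports_global_provisioned_throughput("gemini-embedding-001"))  # False
```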

Google model endpoint locations

Google model endpoints for Generative AI on Vertex AI are available in the following regions.

United States

Columbus, Ohio (us-east5) Dallas, Texas (us-south1) Iowa (us-central1) Las Vegas, Nevada (us-west4) Moncks Corner, South Carolina (us-east1) Northern Virginia (us-east4) Oregon (us-west1)
Gemini 2.5 Flash with Live API native audio
( gemini-live-2.5-flash-preview-native-audio-09-2025 )
Gemini 2.5 Flash with Live API native audio
( gemini-live-2.5-flash-preview-native-audio )
Gemini 2.5 Flash Image
( gemini-2.5-flash-image )
Gemini 2.5 Flash
( gemini-2.5-flash )
Gemini 2.5 Pro
( gemini-2.5-pro )
Gemini 2.5 Flash-Lite
( gemini-2.5-flash-lite )
Gemini 2.0 Flash
( gemini-2.0-flash-001 )
Gemini 2.0 Flash-Lite
( gemini-2.0-flash-lite-001 )
Gemini Embeddings
( gemini-embedding-001 )
Embeddings for Text
Embeddings for Multimodal
Imagen
( imagegeneration@002 )
Imagen 2
( imagegeneration@005 )
Imagen 2
( imagegeneration@006 )
Imagen 3
( imagen-3.0-generate-001 )
Imagen 3 Fast
( imagen-3.0-fast-generate-001 )
Imagen 3 Editing and Customization
( imagen-3.0-capability-001 )
Imagen 3
( imagen-3.0-generate-002 )
Imagen 4
( imagen-4.0-generate-001 )
Imagen 4
( imagen-4.0-fast-generate-001 )
Imagen 4 Ultra Generate experimental
( imagen-4.0-ultra-generate-001 )
Veo 2
( veo-2.0-generate-001 )
Veo 3
( veo-3.0-generate-001 )
Veo 3 Fast
( veo-3.0-fast-generate-001 )
Veo 3 (Preview)
( veo-3.0-generate-preview )
Veo 3 Fast (Preview)
( veo-3.0-fast-generate-preview )
Chirp 3: Transcription ( chirp_3 )
Chirp 2: Transcription ( chirp_2 )
Gemini 2.5 Flash TTS ( gemini-2.5-flash-tts )
Gemini 2.5 Pro TTS ( gemini-2.5-pro-tts )
Chirp 3: HD Voices
Chirp 3: Instant Custom Voice

Canada

* Montréal (northamerica-northeast1)
Gemini 2.5 Flash with Live API native audio
( gemini-live-2.5-flash-preview-native-audio-09-2025 )
*
Gemini 2.5 Flash with Live API native audio
( gemini-live-2.5-flash-preview-native-audio )
*
Gemini 2.5 Flash Image
( gemini-2.5-flash-image )
*
Gemini 2.5 Flash
( gemini-2.5-flash )
*
Gemini 2.5 Pro
( gemini-2.5-pro )
*
Gemini 2.5 Flash-Lite
( gemini-2.5-flash-lite )
Gemini 2.0 Flash
( gemini-2.0-flash-001 )
*
Gemini 2.0 Flash-Lite
( gemini-2.0-flash-lite-001 )
*
Gemini Embeddings
( gemini-embedding-001 )
*
Embeddings for Text *
Embeddings for Multimodal *
Imagen
( imagegeneration@002 )
*
Imagen 2
( imagegeneration@005 )
*
Imagen 2
( imagegeneration@006 )
*
Imagen 3
( imagen-3.0-generate-001 )
*
Imagen 3 Fast
( imagen-3.0-fast-generate-001 )
*
Imagen 3 Editing and Customization
( imagen-3.0-capability-001 )
*
Imagen 3
( imagen-3.0-generate-002 )
*
Imagen 4
( imagen-4.0-generate-001 )
*
Imagen 4
( imagen-4.0-fast-generate-001 )
*
Imagen 4 Ultra Generate experimental
( imagen-4.0-ultra-generate-001 )
*
Chirp 3: Transcription ( chirp_3 )
Chirp 2: Transcription ( chirp_2 )
Gemini 2.5 Flash TTS ( gemini-2.5-flash-tts )
Gemini 2.5 Pro TTS ( gemini-2.5-pro-tts )
Chirp 3: HD Voices
Chirp 3: Instant Custom Voice

South America

* São Paulo, Brazil (southamerica-east1)
Gemini 2.5 Flash with Live API native audio
( gemini-live-2.5-flash-preview-native-audio-09-2025 )
*
Gemini 2.5 Flash with Live API native audio
( gemini-live-2.5-flash-preview-native-audio )
*
Gemini 2.5 Flash Image
( gemini-2.5-flash-image )
*
Gemini 2.5 Flash
( gemini-2.5-flash )
*
Gemini 2.5 Pro
( gemini-2.5-pro )
*
Gemini 2.5 Flash-Lite
( gemini-2.5-flash-lite )
Gemini 2.0 Flash
( gemini-2.0-flash-001 )
*
Gemini 2.0 Flash-Lite
( gemini-2.0-flash-lite-001 )
*
Gemini Embeddings
( gemini-embedding-001 )
*
Embeddings for Text *
Embeddings for Multimodal *
Imagen
( imagegeneration@002 )
*
Imagen 2
( imagegeneration@005 )
*
Imagen 2
( imagegeneration@006 )
*
Imagen 3
( imagen-3.0-generate-001 )
*
Imagen 3 Fast
( imagen-3.0-fast-generate-001 )
*
Imagen 3 Editing and Customization
( imagen-3.0-capability-001 )
*
Imagen 3
( imagen-3.0-generate-002 )
*
Imagen 4
( imagen-4.0-generate-001 )
*
Imagen 4
( imagen-4.0-fast-generate-001 )
*
Imagen 4 Ultra Generate experimental
( imagen-4.0-ultra-generate-001 )
*
Chirp 3: Transcription ( chirp_3 )
Chirp 2: Transcription ( chirp_2 )
Gemini 2.5 Flash TTS ( gemini-2.5-flash-tts )
Gemini 2.5 Pro TTS ( gemini-2.5-pro-tts )
Chirp 3: HD Voices
Chirp 3: Instant Custom Voice

Europe

* Netherlands (europe-west4) Paris, France (europe-west9) London, United Kingdom (europe-west2) Frankfurt, Germany (europe-west3) Belgium (europe-west1) Zürich, Switzerland (europe-west6) Madrid, Spain (europe-southwest1) Milan, Italy (europe-west8) Finland (europe-north1) Warsaw, Poland (europe-central2)
Gemini 2.5 Flash with Live API native audio
( gemini-live-2.5-flash-preview-native-audio-09-2025 )
* * * * * * * * * *
Gemini 2.5 Flash with Live API native audio
( gemini-live-2.5-flash-preview-native-audio )
* * * * * * * * * *
Gemini 2.5 Flash Image
( gemini-2.5-flash-image )
* * * * * * * * * *
Gemini 2.5 Flash
( gemini-2.5-flash )
* * * * * * * * * *
Gemini 2.5 Pro
( gemini-2.5-pro )
* * * * * * * * * *
Gemini 2.5 Flash-Lite
( gemini-2.5-flash-lite )
* * * * * * * * * *
Gemini 2.0 Flash
( gemini-2.0-flash-001 )
* * * * * * * * * *
Gemini 2.0 Flash-Lite
( gemini-2.0-flash-lite-001 )
* * * * * * * * * *
Gemini Embeddings
( gemini-embedding-001 )
* * * * * * * * * *
Embeddings for Text * * * * * * * * * *
Embeddings for Multimodal * * * * * * * * * *
Imagen
( imagegeneration@002 )
* * * * *
Imagen 2
( imagegeneration@005 )
* * * * * * * * * *
Imagen 2
( imagegeneration@006 )
* * * * *
Imagen 3
( imagen-3.0-generate-001 )
* * * * * * * * * *
Imagen 3 Fast
( imagen-3.0-fast-generate-001 )
* * * * * * * * * *
Imagen 3 Editing and Customization
( imagen-3.0-capability-001 )
* * * * * * * * * *
Imagen 3
( imagen-3.0-generate-002 )
* * * * * * * * * *
Imagen 4
( imagen-4.0-generate-001 )
* * * * * * * * * *
Imagen 4
( imagen-4.0-fast-generate-001 )
* * * * * * * * * *
Imagen 4 Ultra Generate experimental
( imagen-4.0-ultra-generate-001 )
* * * * * * * * * *
Chirp 3: Transcription ( chirp_3 ) * * * * * * * * * *
Chirp 2: Transcription ( chirp_2 ) * * * * * * * * * *
Gemini 2.5 Flash TTS ( gemini-2.5-flash-tts ) * * * * * * * * * *
Gemini 2.5 Pro TTS ( gemini-2.5-pro-tts ) * * * * * * * * * *
Chirp 3: HD Voices * * * * * * * * * *
Chirp 3: Instant Custom Voice * * * * * * * * * *

Asia Pacific

* Tokyo, Japan (asia-northeast1) Sydney, Australia (australia-southeast1) Singapore (asia-southeast1) Seoul, Korea (asia-northeast3) Taiwan (asia-east1) Hong Kong, China (asia-east2) Mumbai, India (asia-south1)
Gemini 2.5 Flash with Live API native audio
( gemini-live-2.5-flash-preview-native-audio-09-2025 )
* * * * * * *
Gemini 2.5 Flash with Live API native audio
( gemini-live-2.5-flash-preview-native-audio )
* * * * * * *
Gemini 2.5 Flash Image
( gemini-2.5-flash-image )
* * * * * * *
Gemini 2.5 Flash
( gemini-2.5-flash )
* * * * * * *
Gemini 2.5 Pro
( gemini-2.5-pro )
* * * * * * *
Gemini 2.5 Flash-Lite
( gemini-2.5-flash-lite )
Gemini 2.0 Flash
( gemini-2.0-flash-001 )
* * * * * * *
Gemini 2.0 Flash-Lite
( gemini-2.0-flash-lite-001 )
* * * * * * *
Gemini Embeddings
( gemini-embedding-001 )
* * * * * * *
Embeddings for Text * * * * * * *
Embeddings for Multimodal * * * * * * *
Imagen
( imagegeneration@002 )
* * * *
Imagen 2
( imagegeneration@005 )
* * * * * * *
Imagen 2
( imagegeneration@006 )
* * * *
Imagen 3
( imagen-3.0-generate-001 )
* * * * * * *
Imagen 3 Fast
( imagen-3.0-fast-generate-001 )
* * * * * * *
Imagen 3 Editing and Customization
( imagen-3.0-capability-001 )
* * * * * * *
Imagen 3
( imagen-3.0-generate-002 )
* * * * * * *
Imagen 4
( imagen-4.0-generate-001 )
* * * * * * *
Imagen 4
( imagen-4.0-fast-generate-001 )
* * * * * * *
Imagen 4 Ultra Generate experimental
( imagen-4.0-ultra-generate-001 )
* * * * * * *
Chirp 3: Transcription ( chirp_3 ) * * * * * * * * * *
Chirp 2: Transcription ( chirp_2 ) * * * * * * * * * *
Gemini 2.5 Flash TTS ( gemini-2.5-flash-tts ) * * * * * * * * * *
Gemini 2.5 Pro TTS ( gemini-2.5-pro-tts ) * * * * * * * * * *
Chirp 3: HD Voices * * * * * * * * * *
Chirp 3: Instant Custom Voice * * * * * * * * * *

Middle East

* Dammam, Saudi Arabia (me-central2) Doha, Qatar (me-central1) Tel Aviv, Israel (me-west1)
Gemini 2.5 Flash with Live API native audio
( gemini-live-2.5-flash-preview-native-audio-09-2025 )
* * *
Gemini 2.5 Flash with Live API native audio
( gemini-live-2.5-flash-preview-native-audio )
* * *
Gemini 2.5 Flash Image
( gemini-2.5-flash-image )
* * *
Gemini 2.5 Flash
( gemini-2.5-flash )
* * *
Gemini 2.5 Pro
( gemini-2.5-pro )
* * *
Gemini 2.5 Flash-Lite
( gemini-2.5-flash-lite )
Gemini 2.0 Flash
( gemini-2.0-flash-001 )
* * *
Gemini 2.0 Flash-Lite
( gemini-2.0-flash-lite-001 )
* * *
Gemini Embeddings
( gemini-embedding-001 )
* * *
Embeddings for Text * * *
Embeddings for Multimodal * * *
Imagen
( imagegeneration@002 )
Imagen 2
( imagegeneration@005 )
* * *
Imagen 2
( imagegeneration@006 )
Imagen 3
( imagen-3.0-generate-001 )
* * *
Imagen 3 Fast
( imagen-3.0-fast-generate-001 )
* * *
Imagen 3 Editing and Customization
( imagen-3.0-capability-001 )
* * *
Imagen 3
( imagen-3.0-generate-002 )
* * *
Imagen 4
( imagen-4.0-generate-001 )
* * *
Imagen 4
( imagen-4.0-fast-generate-001 )
* * *
Imagen 4 Ultra Generate experimental
( imagen-4.0-ultra-generate-001 )
* * *
Chirp 3: Transcription ( chirp_3 ) * * * * * * * * * *
Chirp 2: Transcription ( chirp_2 ) * * * * * * * * * *
Gemini 2.5 Flash TTS ( gemini-2.5-flash-tts ) * * * * * * * * * *
Gemini 2.5 Pro TTS ( gemini-2.5-pro-tts ) * * * * * * * * * *
Chirp 3: HD Voices * * * * * * * * * *
Chirp 3: Instant Custom Voice * * * * * * * * * *

Global

Global (global)
Gemini 2.5 Flash with Live API native audio
( gemini-live-2.5-flash-preview-native-audio-09-2025 )
Gemini 2.5 Flash with Live API native audio
( gemini-live-2.5-flash-preview-native-audio )
Gemini 2.5 Flash Image
( gemini-2.5-flash-image )
Gemini 2.5 Flash
( gemini-2.5-flash )
Gemini 2.5 Pro
( gemini-2.5-pro )
Gemini 2.5 Flash-Lite
( gemini-2.5-flash-lite )
Gemini 2.0 Flash
( gemini-2.0-flash-001 )
Gemini 2.0 Flash-Lite
( gemini-2.0-flash-lite-001 )
Gemini Embeddings
( gemini-embedding-001 )
Embeddings for Text
Embeddings for Multimodal
Imagen
( imagegeneration@002 )
Imagen 2
( imagegeneration@005 )
Imagen 2
( imagegeneration@006 )
Imagen 3
( imagen-3.0-generate-001 )
Imagen 3 Fast
( imagen-3.0-fast-generate-001 )
Imagen 3 Editing and Customization
( imagen-3.0-capability-001 )
Imagen 3
( imagen-3.0-generate-002 )
Imagen 4
( imagen-4.0-generate-001 )
Imagen 4
( imagen-4.0-fast-generate-001 )
Imagen 4 Ultra Generate experimental
( imagen-4.0-ultra-generate-001 )
Chirp 3: Transcription ( chirp_3 )
Chirp 2: Transcription ( chirp_2 )
Gemini 2.5 Flash TTS ( gemini-2.5-flash-tts )
Gemini 2.5 Pro TTS ( gemini-2.5-pro-tts )
Chirp 3: HD Voices
Chirp 3: Instant Custom Voice

* Region is available only while using Single Zone Provisioned Throughput or batch prediction, and supervised fine-tuning isn't supported in this region.

Google Cloud partner model endpoint locations

Google serves requests from the region that you specified. For some models, Google also offers a global endpoint to improve overall availability and reduce error rates. The global endpoint can have a separate set of quotas from the regional endpoint and doesn't support data residency requirements. For more information, see the "Regional and global endpoint" section in Vertex AI partner models for MaaS.

Partner model endpoints for Generative AI on Vertex AI are available in the following regions:

United States

* Columbus, Ohio (us-east5) Dallas, Texas (us-south1) Iowa (us-central1) Las Vegas, Nevada (us-west4) Moncks Corner, South Carolina (us-east1) Northern Virginia (us-east4) Oregon (us-west1)
Anthropic's Claude Sonnet 4.5 *
Anthropic's Claude Opus 4.1 *
Anthropic's Claude Haiku 4.5 *
Anthropic's Claude Opus 4 *
Anthropic's Claude Sonnet 4 *
Anthropic's Claude 3.7 Sonnet *
Anthropic's Claude 3.5 Haiku *
Anthropic's Claude 3 Haiku *
Mistral Medium 3 * * *
Mistral OCR (25.05) * * *
Mistral Small 3.1 (25.03) * * *
Mistral Large (24.07) * * *
Codestral 2 * * *
Codestral (24.05) * * *

Europe

Netherlands (europe-west4) Belgium (europe-west1)
Anthropic's Claude Sonnet 4.5
Anthropic's Claude Opus 4.1
Anthropic's Claude Haiku 4.5
Anthropic's Claude Opus 4
Anthropic's Claude Sonnet 4
Anthropic's Claude 3.7 Sonnet
Anthropic's Claude 3.5 Haiku
Anthropic's Claude 3 Haiku
Mistral Medium 3
Mistral OCR (25.05)
Mistral Small 3.1 (25.03)
Mistral Large (24.07)
Codestral 2
Codestral (24.05)

Asia Pacific

Singapore (asia-southeast1) Taiwan (asia-east1)
Anthropic's Claude Sonnet 4.5
Anthropic's Claude Opus 4.1
Anthropic's Claude Haiku 4.5
Anthropic's Claude Opus 4
Anthropic's Claude Sonnet 4
Anthropic's Claude 3.7 Sonnet
Anthropic's Claude 3.5 Haiku
Anthropic's Claude 3 Haiku
Mistral Medium 3
Mistral OCR (25.05)
Mistral Small 3.1 (25.03)
Mistral Large (24.07)
Codestral 2
Codestral (24.05)

Global

* Global (global)
Anthropic's Claude Sonnet 4.5 *
Anthropic's Claude Opus 4.1 * *
Anthropic's Claude Haiku 4.5 * *
Anthropic's Claude Opus 4 * *
Anthropic's Claude Sonnet 4 *
Anthropic's Claude 3.7 Sonnet *
Anthropic's Claude 3.5 Haiku *
Anthropic's Claude 3 Haiku *
Mistral Medium 3 *
Mistral OCR (25.05) *
Mistral Small 3.1 (25.03) *
Mistral Large (24.07) *
Codestral 2 *
Codestral (24.05) *

Google Cloud open model endpoint locations

Google serves requests from the region that you specified. For some models, Google also offers a global endpoint to improve overall availability and reduce error rates. The global endpoint can have a separate set of quotas from the regional endpoint and doesn't support data residency requirements. For more information, see the "Regional and global endpoint" section in Vertex AI open models for MaaS.

Open model endpoints for Generative AI on Vertex AI are available in the following regions:

United States

* Columbus, Ohio (us-east5) Dallas, Texas (us-south1) Iowa (us-central1) Las Vegas, Nevada (us-west4) Moncks Corner, South Carolina (us-east1) Northern Virginia (us-east4) Oregon (us-west1)
DeepSeek R1 (0528) * * *
Llama 4 Maverick 17B-128E (Preview) *
Llama 4 Scout 17B-16E (Preview) *
Llama 3.3 70B (Preview) * * * * * * *
Llama 3.2 90B (Preview) * * * * * * *
Llama 3.1 405B * * * * * * *
Llama 3.1 70B (Preview) * * * * * * *
Llama 3.1 8B (Preview) * * * * * * *
Multilingual E5 Small * * * * * * *
Multilingual E5 Large * * * * * * *

Europe

* Netherlands (europe-west4) Belgium (europe-west1)
DeepSeek R1 (0528) *
Llama 4 Maverick 17B-128E (Preview) *
Llama 4 Scout 17B-16E (Preview) *
Llama 3.3 70B (Preview) * * *
Llama 3.2 90B (Preview) * * *
Llama 3.1 405B * * *
Llama 3.1 70B (Preview) * * *
Llama 3.1 8B (Preview) * * *
Multilingual E5 Small * *
Multilingual E5 Large * *

Asia Pacific

Singapore (asia-southeast1) Taiwan (asia-east1)
DeepSeek R1 (0528)
Llama 4 Maverick 17B-128E (Preview)
Llama 4 Scout 17B-16E (Preview)
Llama 3.3 70B (Preview)
Llama 3.2 90B (Preview)
Llama 3.1 405B
Llama 3.1 70B (Preview)
Llama 3.1 8B (Preview)

Global

Global (global)
DeepSeek R1 (0528)
Llama 4 Maverick 17B-128E (Preview)
Llama 4 Scout 17B-16E (Preview)
Llama 3.3 70B (Preview)
Llama 3.2 90B (Preview)
Llama 3.1 405B
Llama 3.1 70B (Preview)
Llama 3.1 8B (Preview)
