Starting April 29, 2025, Gemini 1.5 Pro and Gemini 1.5 Flash models are not available in projects that have no prior usage of these models, including new projects. For details, see Model versions and lifecycle .

Authenticate

To use the OpenAI Python libraries, install the OpenAI SDK:

 pip  
install  
openai

To authenticate with the Chat Completions API, you can either modify your client setup or change your environment configuration to use Google authentication and a Vertex AI endpoint. Choose whichever method that's easier, and follow the steps for setting up depending on whether you want to call Gemini models or self-deployed Model Garden models.

Certain models in Model Garden and supported Hugging Face models need to be deployed to a Vertex AI endpoint first before they can serve requests. When calling these self-deployed models from the Chat Completions API, you need to specify the endpoint ID. To list your existing Vertex AI endpoints, use the gcloud ai endpoints list command .

Client setup

To programmatically get Google credentials in Python, you can use the google-auth Python SDK:

 pip  
install  
google-auth  
requests

Python

Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries . For more information, see the Vertex AI Python API reference documentation .

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment .

  import 
  
 openai 
 from 
  
 google.auth 
  
 import 
 default 
 import 
  
 google.auth.transport.requests 
 # TODO(developer): Update and un-comment below lines 
 # project_id = "PROJECT_ID" 
 # location = "us-central1" 
 # Programmatically get an access token 
 credentials 
 , 
 _ 
 = 
 default 
 ( 
 scopes 
 = 
 [ 
 "https://www.googleapis.com/auth/cloud-platform" 
 ]) 
 credentials 
 . 
 refresh 
 ( 
 google 
 . 
 auth 
 . 
 transport 
 . 
 requests 
 . 
 Request 
 ()) 
 # Note: the credential lives for 1 hour by default (https://cloud.google.com/docs/authentication/token-types#at-lifetime); after expiration, it must be refreshed. 
 ############################## 
 # Choose one of the following: 
 ############################## 
 # If you are calling a Gemini model, set the ENDPOINT_ID variable to use openapi. 
 ENDPOINT_ID 
 = 
 "openapi" 
 # If you are calling a self-deployed model from Model Garden, set the 
 # ENDPOINT_ID variable and set the client's base URL to use your endpoint. 
 # ENDPOINT_ID = "YOUR_ENDPOINT_ID" 
 # OpenAI Client 
 client 
 = 
 openai 
 . 
 OpenAI 
 ( 
 base_url 
 = 
 f 
 "https:// 
 { 
 location 
 } 
 -aiplatform.googleapis.com/v1/projects/ 
 { 
 project_id 
 } 
 /locations/ 
 { 
 location 
 } 
 /endpoints/ 
 { 
 ENDPOINT_ID 
 } 
 " 
 , 
 api_key 
 = 
 credentials 
 . 
 token 
 , 
 )

By default, service account access tokens last for 1 hour. You can extend the life of service account access tokens or periodically refresh your token and update the openai.api_key variable.

Environment variables

Install the Google Cloud CLI. The OpenAI library can read the OPENAI_API_KEY and OPENAI_BASE_URL environment variables to change the authentication and endpoint in their default client. Set the following variables:

 $  
 export 
  
 PROJECT_ID 
 = 
 PROJECT_ID 
$  
 export 
  
 LOCATION 
 = 
 LOCATION 
$  
 export 
  
 OPENAI_API_KEY 
 = 
 " 
 $( 
gcloud  
auth  
application-default  
print-access-token ) 
 "

To call a Gemini model, set the MODEL_ID variable and use the openapi endpoint:

 $  
 export 
  
 MODEL_ID 
 = 
 MODEL_ID 
$  
 export 
  
 OPENAI_BASE_URL 
 = 
 "https:// 
 ${ 
 LOCATION 
 } 
 -aiplatform.googleapis.com/v1beta1/projects/ 
 ${ 
 PROJECT_ID 
 } 
 /locations/ 
 ${ 
 LOCATION 
 } 
 /endpoints/openapi"

To call a self-deployed model from Model Garden, set the ENDPOINT variable and use that in your URL instead:

 $  
 export 
  
 ENDPOINT 
 = 
 ENDPOINT_ID 
$  
 export 
  
 OPENAI_BASE_URL 
 = 
 "https:// 
 ${ 
 LOCATION 
 } 
 -aiplatform.googleapis.com/v1beta1/projects/ 
 ${ 
 PROJECT_ID 
 } 
 /locations/ 
 ${ 
 LOCATION 
 } 
 /endpoints/ 
 ${ 
 ENDPOINT 
 } 
 "

Next, initialize the client:

  client 
 = 
 openai 
 . 
 OpenAI 
 ()

The Gemini Chat Completions API uses OAuth to authenticate with a short-lived access token . By default, service account access tokens last for 1 hour. You can extend the life of service account access tokens or periodically refresh your token and update the openai.api_key variable.

Refresh your credentials

The following example shows how to refresh your credentials automatically as needed:

Python

  from 
  
 typing 
  
 import 
 Any 
 import 
  
 google.auth 
 import 
  
 google.auth.transport.requests 
 import 
  
 openai 
 class 
  
 OpenAICredentialsRefresher 
 : 
 def 
  
 __init__ 
 ( 
 self 
 , 
 ** 
 kwargs 
 : 
 Any 
 ) 
 - 
> None 
 : 
 # Set a placeholder key here 
 self 
 . 
 client 
 = 
 openai 
 . 
 OpenAI 
 ( 
 ** 
 kwargs 
 , 
 api_key 
 = 
 "PLACEHOLDER" 
 ) 
 self 
 . 
 creds 
 , 
 self 
 . 
 project 
 = 
 google 
 . 
 auth 
 . 
 default 
 ( 
 scopes 
 = 
 [ 
 "https://www.googleapis.com/auth/cloud-platform" 
 ] 
 ) 
 def 
  
 __getattr__ 
 ( 
 self 
 , 
 name 
 : 
 str 
 ) 
 - 
> Any 
 : 
 if 
 not 
 self 
 . 
 creds 
 . 
 valid 
 : 
 self 
 . 
 creds 
 . 
 refresh 
 ( 
 google 
 . 
 auth 
 . 
 transport 
 . 
 requests 
 . 
 Request 
 ()) 
 if 
 not 
 self 
 . 
 creds 
 . 
 valid 
 : 
 raise 
 RuntimeError 
 ( 
 "Unable to refresh auth" 
 ) 
 self 
 . 
 client 
 . 
 api_key 
 = 
 self 
 . 
 creds 
 . 
 token 
 return 
 getattr 
 ( 
 self 
 . 
 client 
 , 
 name 
 ) 
 # TODO(developer): Update and un-comment below lines 
 # project_id = "PROJECT_ID" 
 # location = "us-central1" 
 client 
 = 
 OpenAICredentialsRefresher 
 ( 
 base_url 
 = 
 f 
 "https:// 
 { 
 location 
 } 
 -aiplatform.googleapis.com/v1/projects/ 
 { 
 project_id 
 } 
 /locations/ 
 { 
 location 
 } 
 /endpoints/openapi" 
 , 
 ) 
 response 
 = 
 client 
 . 
 chat 
 . 
 completions 
 . 
 create 
 ( 
 model 
 = 
 "google/gemini-2.0-flash-001" 
 , 
 messages 
 = 
 [{ 
 "role" 
 : 
 "user" 
 , 
 "content" 
 : 
 "Why is the sky blue?" 
 }], 
 ) 
 print 
 ( 
 response 
 )

What's next

See examples of calling the Chat Completions API with the OpenAI-compatible syntax.
See examples of calling the Inference API with the OpenAI-compatible syntax.
See examples of calling the Function Calling API with OpenAI-compatible syntax.
Learn more about the Gemini API .
Learn more about migrating from Azure OpenAI to the Gemini API .

Authenticate Stay organized with collections Save and categorize content based on your preferences.

Client setup

Python

Environment variables

Refresh your credentials

Python

What's next

Authenticate