This page describes the following configuration options:
- Use model configuration to control responses, like temperature
You can also generate structured output, including JSON and enums.
Before you begin
Make sure that you've completed the getting started guide for building hybrid experiences.
Set an inference mode
The examples in the getting started guide use the PREFER_ON_DEVICE mode, but this is only one of the four available inference modes.
- PREFER_ON_DEVICE: Use the on-device model if it's available; otherwise, fall back to the cloud-hosted model.
  const model = getGenerativeModel(ai, { mode: InferenceMode.PREFER_ON_DEVICE });
- ONLY_ON_DEVICE: Use the on-device model if it's available; otherwise, throw an exception.
  const model = getGenerativeModel(ai, { mode: InferenceMode.ONLY_ON_DEVICE });
- PREFER_IN_CLOUD: Use the cloud-hosted model if it's available; otherwise, fall back to the on-device model.
  const model = getGenerativeModel(ai, { mode: InferenceMode.PREFER_IN_CLOUD });
- ONLY_IN_CLOUD: Use the cloud-hosted model if it's available; otherwise, throw an exception.
  const model = getGenerativeModel(ai, { mode: InferenceMode.ONLY_IN_CLOUD });
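The fallback behavior of the four modes can be sketched as a plain function. This is a simplified mental model, not the SDK's actual implementation; the availability flags stand in for the SDK's internal checks.

```javascript
// Simplified sketch of how each inference mode resolves to a backend.
// "onDeviceAvailable" and "cloudAvailable" are stand-ins for the SDK's
// internal availability checks; the return values mirror inferenceSource.
function resolveBackend(mode, onDeviceAvailable, cloudAvailable) {
  switch (mode) {
    case 'PREFER_ON_DEVICE':
      if (onDeviceAvailable) return 'ON_DEVICE';
      if (cloudAvailable) return 'IN_CLOUD';
      throw new Error('No backend available');
    case 'ONLY_ON_DEVICE':
      if (onDeviceAvailable) return 'ON_DEVICE';
      throw new Error('On-device model unavailable');
    case 'PREFER_IN_CLOUD':
      if (cloudAvailable) return 'IN_CLOUD';
      if (onDeviceAvailable) return 'ON_DEVICE';
      throw new Error('No backend available');
    case 'ONLY_IN_CLOUD':
      if (cloudAvailable) return 'IN_CLOUD';
      throw new Error('Cloud-hosted model unavailable');
    default:
      throw new Error('Unknown mode: ' + mode);
  }
}

console.log(resolveBackend('PREFER_ON_DEVICE', false, true)); // IN_CLOUD
```

Note that the two ONLY_* modes never fall back: if their single backend is unavailable, the request fails, so your app should be prepared to catch that exception.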
Determine whether on-device or in-cloud inference was used
If you use the PREFER_ON_DEVICE or PREFER_IN_CLOUD inference modes, it might be helpful to know which source handled a given request. This information is provided by the inferenceSource property of each response (available starting with JS SDK v12.5.0). The returned value is either ON_DEVICE or IN_CLOUD.
// ...
console.log('You used: ' + result.response.inferenceSource);
console.log(result.response.text());
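Beyond logging, you might want to aggregate inferenceSource values across many requests, for example to see how often the cloud fallback is actually used. The helper below is hypothetical (the SDK only provides the per-response property); the stubbed response objects are for illustration.

```javascript
// Hypothetical helper: tally inferenceSource values across a batch of
// responses to see how often each backend handled a request.
function tallyInferenceSources(responses) {
  const counts = { ON_DEVICE: 0, IN_CLOUD: 0 };
  for (const response of responses) {
    if (response.inferenceSource in counts) {
      counts[response.inferenceSource] += 1;
    }
  }
  return counts;
}

// Stubbed responses for illustration; in real code these would be
// result.response objects from the SDK.
const counts = tallyInferenceSources([
  { inferenceSource: 'ON_DEVICE' },
  { inferenceSource: 'ON_DEVICE' },
  { inferenceSource: 'IN_CLOUD' },
]);
console.log(counts); // { ON_DEVICE: 2, IN_CLOUD: 1 }
```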
Override the default fallback model
The default cloud-hosted model is gemini-2.5-flash-lite. This model is the fallback cloud-hosted model when you use the PREFER_ON_DEVICE mode. It's also the default model when you use the ONLY_IN_CLOUD mode or the PREFER_IN_CLOUD mode.
You can use the inCloudParams
configuration option to specify an alternative default cloud-hosted model.
const model = getGenerativeModel(ai, {
  mode: InferenceMode.INFERENCE_MODE,
  inCloudParams: {
    model: "GEMINI_MODEL_NAME"
  }
});
Find model names for all supported Gemini models.
Use model configuration to control responses
In each request to a model, you can send along a model configuration to control how the model generates a response. Cloud-hosted models and on-device models offer different configuration options.
The configuration is maintained for the lifetime of the instance. If you want to
use a different config, create a new GenerativeModel
instance with that
config.
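Because the configuration is fixed for the lifetime of an instance, one pattern is to cache instances per configuration rather than recreating them on every request. The factory below is a hypothetical sketch; createModel stands in for a call like getGenerativeModel(ai, …).

```javascript
// Hypothetical pattern: since a GenerativeModel's config is immutable,
// cache one instance per distinct configuration instead of mutating one.
// "createModel" is a stand-in for getGenerativeModel(ai, ...).
function makeModelFactory(createModel) {
  const cache = new Map();
  return (config) => {
    // Note: JSON.stringify keys are order-sensitive; pass configs
    // with consistent key order for the cache to hit.
    const key = JSON.stringify(config);
    if (!cache.has(key)) {
      cache.set(key, createModel(config));
    }
    return cache.get(key);
  };
}

// Stubbed createModel for illustration:
const getModel = makeModelFactory((config) => ({ config }));
const a = getModel({ temperature: 0.2 });
const b = getModel({ temperature: 0.2 });
console.log(a === b); // true: same config reuses the cached instance
```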
Configure cloud-hosted model
Use the inCloudParams option to configure a cloud-hosted Gemini model. Learn about available parameters.
const model = getGenerativeModel(ai, {
  mode: InferenceMode.INFERENCE_MODE,
  inCloudParams: {
    model: "GEMINI_MODEL_NAME",
    temperature: 0.8,
    topK: 10
  }
});
Configure on-device model
Note that inference using an on-device model uses the Prompt API from Chrome.
Use the onDeviceParams option to configure an on-device model. Learn about available parameters.
const model = getGenerativeModel(ai, {
  mode: InferenceMode.INFERENCE_MODE,
  onDeviceParams: {
    createOptions: {
      temperature: 0.8,
      topK: 8
    }
  }
});
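Since on-device inference depends on the browser exposing Chrome's Prompt API, a defensive feature check can help before committing to a mode like ONLY_ON_DEVICE. The sketch below is an assumption-laden illustration: the global name checked here (LanguageModel) is based on current Chrome Prompt API documentation and may differ across Chrome versions; the SDK performs its own availability checks internally.

```javascript
// Hypothetical feature check: on-device inference requires the browser
// to expose Chrome's Prompt API. The "LanguageModel" global name is an
// assumption based on current Chrome documentation.
function supportsOnDeviceInference(globalObject) {
  return typeof globalObject.LanguageModel !== 'undefined';
}

// Outside a supported Chrome build (e.g. Node.js), this reports false.
console.log(supportsOnDeviceInference(globalThis));
```

An app could use a check like this to pick PREFER_ON_DEVICE when the API is present and fall back to ONLY_IN_CLOUD otherwise.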

