Build hybrid experiences with on-device and cloud-hosted models


Build AI-powered apps and features with hybrid inference using Firebase AI Logic. Hybrid inference lets you run inference with on-device models when they're available and seamlessly fall back to cloud-hosted models otherwise (and vice versa).

With this release, hybrid inference is available in the Firebase AI Logic client SDK for Web, with support for on-device inference in Chrome on desktop.

Jump to the code examples

Recommended use cases and supported capabilities

Recommended use cases:

  • Using an on-device model for inference offers:

    • Enhanced privacy
    • Local context
    • Inference at no cost
    • Offline functionality
  • Using hybrid functionality lets you:

    • Reach 100% of your audience, regardless of on-device model availability or internet connectivity

Supported capabilities and features for on-device inference are covered throughout this guide; see also the list of features not yet available for on-device inference later on this page.

Get started

This guide shows you how to get started using the Firebase AI Logic SDK for Web to perform hybrid inference.

Inference using an on-device model uses the Prompt API from Chrome, whereas inference using a cloud-hosted model uses your chosen Gemini API provider (either the Gemini Developer API or the Vertex AI Gemini API).

Get started developing using localhost, as described in this section (you can also learn more about using APIs on localhost in the Chrome documentation). Then, once you've implemented your feature, you can optionally enable end-users to try out your feature.

Step 1: Set up Chrome and the Prompt API for on-device inference

  1. Make sure you're using a recent version of Chrome. Update in chrome://settings/help.
    On-device inference is available in Chrome v139 and higher.

  2. Enable the on-device multimodal model by setting the following flag to Enabled:

    • chrome://flags/#prompt-api-for-gemini-nano-multimodal-input
  3. Restart Chrome.

  4. (Optional) Download the on-device model before the first request.

    The Prompt API is built into Chrome; however, the on-device model isn't available by default. If you haven't downloaded the model before your first on-device inference request, that request will automatically start the model download in the background.

    View instructions to download the on-device model

    1. Open Developer Tools > Console.

    2. Run the following:

        await LanguageModel.availability();
       
      
    3. Make sure that the output is available, downloading, or downloadable.

    4. If the output is downloadable, start the model download by running:

        await LanguageModel.create();
       
      
    5. You can use the following monitor callback to listen for download progress and make sure that the model is available before making requests:

        const session = await LanguageModel.create({
          monitor(m) {
            m.addEventListener("downloadprogress", (e) => {
              console.log(`Downloaded ${e.loaded * 100}%`);
            });
          },
        });
       
      

Step 2: Set up a Firebase project and connect your app to Firebase

  1. Sign in to the Firebase console, and then select your Firebase project.

    Don't already have a Firebase project?

    If you don't already have a Firebase project, click the button to create a new Firebase project, and then use either of the following options:

    • Option 1: Create a wholly new Firebase project (and its underlying Google Cloud project automatically) by entering a new project name in the first step of the workflow.

    • Option 2: "Add Firebase" to an existing Google Cloud project by clicking Add Firebase to Google Cloud project (at the bottom of the page). In the first step of the workflow, start entering the project name of the existing project, and then select the project from the displayed list.

    Complete the remaining steps of the on-screen workflow to create a Firebase project. Note that when prompted, you do not need to set up Google Analytics to use the Firebase AI Logic SDKs.

  2. In the Firebase console, go to the Firebase AI Logic page.

  3. Click Get started to launch a guided workflow that helps you set up the required APIs and resources for your project.

  4. Select the Gemini API provider that you'd like to use with the Firebase AI Logic SDKs. The Gemini Developer API is recommended for first-time users. You can always add billing or set up the Vertex AI Gemini API later, if you'd like.

    • Gemini Developer API (billing optional): Available on the no-cost Spark pricing plan, and you can upgrade later if desired.
      The console will enable the required APIs and create a Gemini API key in your project.
      Do not add this Gemini API key into your app's codebase. Learn more.

    • Vertex AI Gemini API (billing required): Requires the pay-as-you-go Blaze pricing plan.
      The console will help you set up billing and enable the required APIs in your project.

  5. If prompted in the console's workflow, follow the on-screen instructions to register your app and connect it to Firebase.

  6. Continue to the next step in this guide to add the SDK to your app.

Step 3: Add the SDK

    The Firebase library provides access to the APIs for interacting with generative models. The library is included as part of the Firebase JavaScript SDK for Web.

    1. Install the Firebase JS SDK for Web using npm:

        npm install firebase 
       
      
    2. Initialize Firebase in your app:

        import { initializeApp } from "firebase/app";

        // TODO(developer) Replace the following with your app's Firebase configuration
        // See: https://firebase.google.com/docs/web/learn-more#config-object
        const firebaseConfig = {
          // ...
        };

        // Initialize FirebaseApp
        const firebaseApp = initializeApp(firebaseConfig);
       
      

    Step 4: Initialize the service and create a model instance

     The following example initializes the Gemini Developer API backend; a sketch for the Vertex AI Gemini API backend follows it.

    Before sending a prompt to a Gemini model, initialize the service for your chosen API provider and create a GenerativeModel instance.

    Set the mode to one of:

     • PREFER_ON_DEVICE: Configures the SDK to use the on-device model if it's available, or fall back to the cloud-hosted model.

     • ONLY_ON_DEVICE: Configures the SDK to use the on-device model or throw an exception.

     • PREFER_IN_CLOUD: Configures the SDK to use the cloud-hosted model if it's available, or fall back to the on-device model.

     • ONLY_IN_CLOUD: Configures the SDK to never use the on-device model.

     When you use PREFER_ON_DEVICE, PREFER_IN_CLOUD, or ONLY_IN_CLOUD, the default cloud-hosted model is gemini-2.0-flash-lite, but you can override the default.

      import { initializeApp } from "firebase/app";
      import { getAI, getGenerativeModel, GoogleAIBackend, InferenceMode } from "firebase/ai";

      // TODO(developer) Replace the following with your app's Firebase configuration
      // See: https://firebase.google.com/docs/web/learn-more#config-object
      const firebaseConfig = {
        // ...
      };

      // Initialize FirebaseApp
      const firebaseApp = initializeApp(firebaseConfig);

      // Initialize the Gemini Developer API backend service
      const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

      // Create a `GenerativeModel` instance
      // Set the mode, for example to use on-device model when possible
      const model = getGenerativeModel(ai, { mode: InferenceMode.PREFER_ON_DEVICE });
     
    

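     If you chose the Vertex AI Gemini API as your provider, only the backend class changes. Here's a minimal sketch, assuming the VertexAIBackend export from firebase/ai (the Vertex AI counterpart of GoogleAIBackend) and the firebaseApp initialized above:

      import { getAI, getGenerativeModel, VertexAIBackend, InferenceMode } from "firebase/ai";

      // `firebaseApp` is the FirebaseApp initialized in the previous snippet
      // Initialize the Vertex AI Gemini API backend service instead of the Gemini Developer API
      const ai = getAI(firebaseApp, { backend: new VertexAIBackend() });

      // Create a `GenerativeModel` instance, setting the mode as before
      const model = getGenerativeModel(ai, { mode: InferenceMode.PREFER_ON_DEVICE });
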
    Send a prompt request to a model

     This section provides examples of how to send various types of input to generate different types of output, including:

     • Generate text from text-only input
     • Generate text from text-and-image (multimodal) input

     If you want to generate structured output (like JSON or enums), then use one of the following "generate text" examples and additionally configure the model to respond according to a provided schema.

    Generate text from text-only input

    Before trying this sample, make sure that you've completed the Get started section of this guide.

    You can use generateContent() to generate text from a prompt that contains text:

      // Imports + initialization of FirebaseApp and backend service + creation of model instance

      // Wrap in an async function so you can use await
      async function run() {
        // Provide a prompt that contains text
        const prompt = "Write a story about a magic backpack.";

        // To generate text output, call `generateContent` with the text input
        const result = await model.generateContent(prompt);

        const response = result.response;
        const text = response.text();
        console.log(text);
      }

      run();
     
    

     Note that Firebase AI Logic also supports streaming of text responses using generateContentStream (instead of generateContent).
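
     For example, here's a minimal streaming sketch, assuming the same model instance created in the Get started section (each streamed chunk exposes a text() method, and the aggregated response resolves once streaming finishes):

      // Minimal streaming sketch (assumes the `model` instance from the Get started section)
      async function runStreaming() {
        const result = await model.generateContentStream("Write a story about a magic backpack.");

        // Log each chunk of text as it arrives
        for await (const chunk of result.stream) {
          console.log(chunk.text());
        }

        // The aggregated response is available once streaming completes
        const response = await result.response;
        console.log(response.text());
      }

      runStreaming();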

    Generate text from text-and-image (multimodal) input

    Before trying this sample, make sure that you've completed the Get started section of this guide.

    You can use generateContent() to generate text from a prompt that contains text and image files—providing each input file's mimeType and the file itself.

    The supported input image types for on-device inference are PNG and JPEG.

      // Imports + initialization of FirebaseApp and backend service + creation of model instance

      // Converts a File object to a Part object.
      async function fileToGenerativePart(file) {
        const base64EncodedDataPromise = new Promise((resolve) => {
          const reader = new FileReader();
          reader.onloadend = () => resolve(reader.result.split(',')[1]);
          reader.readAsDataURL(file);
        });
        return {
          inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
        };
      }

      async function run() {
        // Provide a text prompt to include with the image
        const prompt = "Write a poem about this picture:";

        const fileInputEl = document.querySelector("input[type=file]");
        const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

        // To generate text output, call `generateContent` with the text and image
        const result = await model.generateContent([prompt, imagePart]);

        const response = result.response;
        const text = response.text();
        console.log(text);
      }

      run();
     
    

     Note that Firebase AI Logic also supports streaming of text responses using generateContentStream (instead of generateContent).

    What else can you do?

     In addition to the examples above, you can also enable end-users to try out your feature, use alternative inference modes, override the default fallback model, and use model configuration to control responses.

    Enable end-users to try out your feature

     To enable end-users to try out your feature, you can enroll in the Chrome Origin Trials. Note that these trials have a limited duration and usage.

     1. Register for the Prompt API Chrome Origin Trial. You'll be given a token.

    2. Provide the token on every web page for which you want the trial feature to be enabled. Use one of the following options:

       • Provide the token as a meta tag in the <head> tag: <meta http-equiv="origin-trial" content="TOKEN">

      • Provide the token as an HTTP header: Origin-Trial: TOKEN

      • Provide the token programmatically .
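
     For example, one way to provide the token programmatically is to inject the meta tag from script, following Chrome's general origin trial guidance (a sketch; TOKEN is the token from your registration):

      // Inject the origin trial token at runtime (sketch; replace TOKEN with your trial token)
      const otMeta = document.createElement("meta");
      otMeta.httpEquiv = "origin-trial";
      otMeta.content = "TOKEN";
      document.head.append(otMeta);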

    Use alternative inference modes

     The examples above used the PREFER_ON_DEVICE mode to configure the SDK to use an on-device model if it's available, or fall back to a cloud-hosted model. The SDK offers three alternative inference modes: ONLY_ON_DEVICE, ONLY_IN_CLOUD, and PREFER_IN_CLOUD.

    • Use ONLY_ON_DEVICE mode so that the SDK can only use an on-device model. In this configuration, the API will throw an error if an on-device model is not available.

        const model = getGenerativeModel(ai, { mode: InferenceMode.ONLY_ON_DEVICE });
       
      
    • Use ONLY_IN_CLOUD mode so that the SDK can only use a cloud-hosted model.

        const model = getGenerativeModel(ai, { mode: InferenceMode.ONLY_IN_CLOUD });
       
      
    • Use PREFER_IN_CLOUD mode so that the SDK will attempt to use the cloud-hosted model, but will fall back to the on-device model if the cloud-hosted model is unavailable (for example, the device is offline).

        const model = getGenerativeModel(ai, { mode: InferenceMode.PREFER_IN_CLOUD });
       
      

    Determine whether on-device or in-cloud inference was used

     If you use the PREFER_ON_DEVICE or PREFER_IN_CLOUD inference modes, it can be helpful to know which type of model served a given request. This information is provided by the inferenceSource property of each response (available starting with JS SDK v12.5.0).

     When you access this property, the returned value will be either ON_DEVICE or IN_CLOUD.

      // ...
      console.log('You used: ' + result.response.inferenceSource);
      console.log(result.response.text());
     
    

    Override the default fallback model

     The default cloud-hosted model is gemini-2.0-flash-lite.

    This model is the fallback cloud-hosted model when you use the PREFER_ON_DEVICE mode. It's also the default model when you use the ONLY_IN_CLOUD mode or the PREFER_IN_CLOUD mode.

    You can use the inCloudParams configuration option to specify an alternative default cloud-hosted model.

      const model = getGenerativeModel(ai, {
        mode: InferenceMode.INFERENCE_MODE,
        inCloudParams: {
          model: "GEMINI_MODEL_NAME"
        }
      });
     
    

     Find model names for all supported Gemini models.

    Use model configuration to control responses

    In each request to a model, you can send along a model configuration to control how the model generates a response. Cloud-hosted models and on-device models offer different configuration options.

    The configuration is maintained for the lifetime of the instance. If you want to use a different config, create a new GenerativeModel instance with that config.
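
     For example, here's a minimal sketch of keeping two instances with different configurations (the parameter values are purely illustrative):

      // Each `GenerativeModel` instance keeps its own configuration (values are illustrative)
      const creativeModel = getGenerativeModel(ai, {
        mode: InferenceMode.PREFER_ON_DEVICE,
        onDeviceParams: { createOptions: { temperature: 1.0 } }
      });

      const preciseModel = getGenerativeModel(ai, {
        mode: InferenceMode.PREFER_ON_DEVICE,
        onDeviceParams: { createOptions: { temperature: 0.2 } }
      });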

    Set the configuration for a cloud-hosted model

     Use the inCloudParams option to configure a cloud-hosted Gemini model. Learn about available parameters.

      const model = getGenerativeModel(ai, {
        mode: InferenceMode.INFERENCE_MODE,
        inCloudParams: {
          model: "GEMINI_MODEL_NAME",
          temperature: 0.8,
          topK: 10
        }
      });
     
    

    Set the configuration for an on-device model

     Note that inference using an on-device model uses the Prompt API from Chrome.

     Use the onDeviceParams option to configure an on-device model. Learn about available parameters.

      const model = getGenerativeModel(ai, {
        mode: InferenceMode.INFERENCE_MODE,
        onDeviceParams: {
          createOptions: {
            temperature: 0.8,
            topK: 8
          }
        }
      });
     
    

    Set the configuration for structured output (like JSON)

    Generating structured output (like JSON and enums) is supported for inference using both cloud-hosted and on-device models.

    For hybrid inference, use both inCloudParams and onDeviceParams to configure the model to respond with structured output. For the other modes, use only the applicable configuration.

     • For inCloudParams: Specify the appropriate responseMimeType (in this example, application/json) as well as the responseSchema that you want the model to use.

     • For onDeviceParams: Specify the responseConstraint that you want the model to use.

    JSON output

    The following example adapts the general JSON output example for hybrid inference:

      import { getAI, getGenerativeModel, InferenceMode, Schema } from "firebase/ai";

      const jsonSchema = Schema.object({
        properties: {
          characters: Schema.array({
            items: Schema.object({
              properties: {
                name: Schema.string(),
                accessory: Schema.string(),
                age: Schema.number(),
                species: Schema.string(),
              },
              optionalProperties: ["accessory"],
            }),
          }),
        }
      });

      const model = getGenerativeModel(ai, {
        mode: InferenceMode.INFERENCE_MODE,
        inCloudParams: {
          model: "gemini-2.5-flash",
          generationConfig: {
            responseMimeType: "application/json",
            responseSchema: jsonSchema
          },
        },
        onDeviceParams: {
          promptOptions: {
            responseConstraint: jsonSchema
          }
        }
      });
     
    
    Enum output

    As above, but adapting the documentation on enum output for hybrid inference:

      // ...
      const enumSchema = Schema.enumString({
        enum: ["drama", "comedy", "documentary"],
      });

      const model = getGenerativeModel(ai, {
        // ...
        generationConfig: {
          responseMimeType: "text/x.enum",
          responseSchema: enumSchema
        },
        // ...
      });
      // ...
     
    

    Features not yet available for on-device inference

     Because this is an experimental release, not all capabilities of the Web SDK are available for on-device inference. The following features are not yet supported for on-device inference (but they are usually available for cloud-based inference).

     • Generating text from image file input types other than JPEG and PNG

       • Can fall back to the cloud-hosted model; however, ONLY_ON_DEVICE mode will throw an error.
     • Generating text from audio, video, and document (like PDF) inputs

       • Can fall back to the cloud-hosted model; however, ONLY_ON_DEVICE mode will throw an error.
     • Generating images using Gemini or Imagen models

       • Can fall back to the cloud-hosted model; however, ONLY_ON_DEVICE mode will throw an error.
     • Providing files using URLs in multimodal requests. You must provide files as inline data to on-device models.

     • Multi-turn chat

       • Can fall back to the cloud-hosted model; however, ONLY_ON_DEVICE mode will throw an error.
     • Bidirectional streaming with the Gemini Live API

     • Providing the model with tools to help it generate its response (like function calling, code execution, and grounding with Google Search)

     • Counting tokens

       • Always throws an error. The count will differ between cloud-hosted and on-device models, so there is no intuitive fallback.
    • AI monitoring in the Firebase console for on-device inference.

      • Note that any inference using the cloud-hosted models can be monitored just like other inference using the Firebase AI Logic client SDK for Web.


    Give feedback about your experience with Firebase AI Logic

