Build hybrid experiences with on-device and cloud-hosted models
Build AI-powered apps and features with hybrid inference using Firebase AI Logic. Hybrid inference enables running inference using
on-device models when available and seamlessly falling back to cloud-hosted
models otherwise.
With this release, hybrid inference is available using the Firebase AI Logic client SDK for Web, with support for on-device
inference for Chrome on Desktop.
Reach 100% of your audience, regardless of on-device model availability
Supported capabilities and features for on-device inference:
Single-turn content generation, streaming and non-streaming
Generating text from text-only input
Generating text from text-and-image input, specifically input image types of
JPEG and PNG
Generating structured output, including JSON and enums
Get started
This guide shows you how to get started using the Firebase AI Logic SDK for
Web to perform hybrid inference.
Inference using an on-device model uses the Prompt API from Chrome,
whereas inference using a cloud-hosted model uses your chosen Gemini API provider (either the Gemini Developer API or the Vertex AI Gemini API).
Get started developing using localhost, as described in this section
(you can also learn more about using APIs on localhost in the Chrome documentation). Then, once you've implemented your feature, you
can optionally enable end-users to try out your feature.
Step 1: Set up Chrome and the Prompt API for on-device inference
Make sure you're using a recent version of Chrome. Update in chrome://settings/help. On-device inference is available from Chrome v139 and higher.
Enable the on-device multimodal model by setting the following flag to Enabled:
(Optional) Download the on-device model before the first request.
The Prompt API is built into Chrome; however, the on-device model isn't
available by default. If you haven't yet downloaded the model before your
first request for on-device inference, the request will automatically start
the model download in the background.
View instructions to download the on-device model
Open Developer Tools > Console.
Run the following:
await LanguageModel.availability();
Make sure that the output is available, downloading, or downloadable.
If the output is downloadable, start the model download by running:
await LanguageModel.create();
You can use the following monitor callback to listen for download
progress and make sure that the model is available before making
requests:
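For example, a minimal sketch using the Prompt API's monitor option (the exact shape of the downloadprogress event payload is an assumption):

// Sketch: create a session with a `monitor` callback to observe model download progress.
const session = await LanguageModel.create({
  monitor(m) {
    m.addEventListener("downloadprogress", (e) => {
      // `e.loaded` is assumed to be a fraction between 0 and 1.
      console.log(`Model download: ${Math.round(e.loaded * 100)}%`);
    });
  },
});

// Confirm the model is ready before making on-device requests.
console.log(await LanguageModel.availability()); // expect "available"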
Step 2: Set up a Firebase project and connect your app to Firebase
Sign in to the Firebase console,
and then select your Firebase project.
Don't already have a Firebase project?
If you don't already have a Firebase project, click the button to create a
new Firebase project, and then use either of the following options:
Option 1: Create a wholly new Firebase project (and its underlying Google Cloud project automatically) by entering a new project name in the
first step of the workflow.
Option 2: "Add Firebase" to an existing Google Cloud project by
clicking Add Firebase to Google Cloud project (at the bottom of the page).
In the first step of the workflow, start entering the project name of
the existing project, and then select the project from the displayed list.
Complete the remaining steps of the on-screen workflow to create a Firebase
project. Note that when prompted, you do not need to set up Google Analytics to use the Firebase AI Logic SDKs.
Click Get started to launch a guided workflow that helps you set up the required APIs and resources for your project.
Select the "Gemini API" provider that you'd like to use with the Firebase AI Logic SDKs. The Gemini Developer API is
recommended for first-time users. You can always add billing or set up the Vertex AI Gemini API later, if you'd like.
Gemini Developer API—billing optional (available on the no-cost Spark pricing plan, and you can upgrade later if
desired). The console will enable the required APIs and create a Gemini API key in your project. Do not add this Gemini API key into your app's codebase. Learn more.
Vertex AI Gemini API—billing required (requires the pay-as-you-go Blaze pricing plan). The console will help you set up billing and enable the
required APIs in your project.
If prompted in the console's workflow, follow the on-screen instructions to
register your app and connect it to Firebase.
Continue to the next step in this guide to add the SDK to your app.
Step 3: Add the SDK
The Firebase library provides access to the APIs for interacting with generative
models. The library is included as part of the Firebase JavaScript SDK for Web.
Install the Firebase JS SDK for Web using npm:
npm install firebase
Initialize Firebase in your app:
import { initializeApp } from "firebase/app";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);
Step 4: Initialize the service and create a model instance
Before sending a prompt to a Gemini model, initialize the service for
your chosen API provider and create a GenerativeModel instance.
Set the mode to one of:
PREFER_ON_DEVICE: Configures the SDK to use the on-device model if it's
available, or fall back to the cloud-hosted model.
ONLY_ON_DEVICE: Configures the SDK to use the on-device model or throw
an exception.
ONLY_IN_CLOUD: Configures the SDK to never use the on-device model.
By default when you use PREFER_ON_DEVICE or ONLY_IN_CLOUD, the cloud-hosted
model is gemini-2.0-flash-lite, but you can override the default.
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, InferenceMode } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance
// Set the mode, for example to use on-device model when possible
const model = getGenerativeModel(ai, { mode: InferenceMode.PREFER_ON_DEVICE });
Send a prompt request to a model
This section provides examples for how to send various types of input to
generate different types of output, including:
Generating text from text-only input
Generating text from text-and-image (multimodal) input
Generate text from text-only input
Before trying this sample, make sure that you've completed the Get started section of this guide.
You can use generateContent() to generate text from a prompt that contains text:
// Imports + initialization of FirebaseApp and backend service + creation of model instance

// Wrap in an async function so you can use await
async function run() {
  // Provide a prompt that contains text
  const prompt = "Write a story about a magic backpack.";

  // To generate text output, call `generateContent` with the text input
  const result = await model.generateContent(prompt);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();
Generate text from text-and-image (multimodal) input
Before trying this sample, make sure that you've completed the Get started section of this guide.
You can use generateContent() to generate text from a prompt that contains text and image files—providing each
input file's mimeType and the file itself.
The supported input image types for on-device inference are PNG and JPEG.
// Imports + initialization of FirebaseApp and backend service + creation of model instance

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the image
  const prompt = "Write a poem about this picture:";

  const fileInputEl = document.querySelector("input[type=file]");
  const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

  // To generate text output, call `generateContent` with the text and image
  const result = await model.generateContent([prompt, imagePart]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();
To enable end-users to try out your feature, you can enroll in the Chrome Origin Trials.
Note that these trials have a limited duration and usage.
The examples above used the PREFER_ON_DEVICE mode to configure the SDK to use
an on-device model if it's available, or fall back to a cloud-hosted model. The
SDK offers two alternative inference modes: ONLY_ON_DEVICE and ONLY_IN_CLOUD.
Use ONLY_ON_DEVICE mode so that the SDK can only use an on-device
model. In this configuration, the API will throw an error if an on-device
model is not available.
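For example, a minimal sketch of restricting inference to the on-device model, reusing the ai service instance created earlier (the error handling shown is just one way an app might respond):

// Sketch: use only the on-device model; requests throw if it isn't available.
const onDeviceOnlyModel = getGenerativeModel(ai, { mode: InferenceMode.ONLY_ON_DEVICE });

try {
  const result = await onDeviceOnlyModel.generateContent("Summarize this page in one sentence.");
  console.log(result.response.text());
} catch (e) {
  // With ONLY_ON_DEVICE, there is no cloud fallback, so handle the error in your app.
  console.error("On-device model unavailable:", e);
}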
When you use the PREFER_ON_DEVICE mode, the SDK will fall back to using a
cloud-hosted model if an on-device model is unavailable. The default fallback
cloud-hosted model is gemini-2.0-flash-lite. This cloud-hosted model is also
the default when you use the ONLY_IN_CLOUD mode.
You can use the inCloudParams configuration option to specify an alternative default cloud-hosted model:
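For example, a minimal sketch (the model field inside inCloudParams and the particular model name are illustrative assumptions):

// Sketch: override the default cloud-hosted fallback model.
const customModel = getGenerativeModel(ai, {
  mode: InferenceMode.PREFER_ON_DEVICE,
  inCloudParams: {
    // Assumed field name; pick any cloud-hosted Gemini model your project supports.
    model: "gemini-2.5-flash",
  },
});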
In each request to a model, you can send along a model configuration to control
how the model generates a response. Cloud-hosted models and on-device models
offer different configuration options.
The configuration is maintained for the lifetime of the instance. If you want to
use a different config, create a new GenerativeModel instance with that
config.
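As a sketch of what separate configurations could look like (the generationConfig fields are standard Gemini options; nesting Prompt API createOptions inside onDeviceParams is an assumption):

// Sketch: configure the cloud-hosted and on-device models independently.
const configuredModel = getGenerativeModel(ai, {
  mode: InferenceMode.PREFER_ON_DEVICE,
  inCloudParams: {
    model: "gemini-2.0-flash-lite",
    generationConfig: {
      temperature: 0.8,
      topK: 10,
    },
  },
  onDeviceParams: {
    // Assumed shape: options forwarded to the Prompt API's create() call.
    createOptions: {
      temperature: 0.8,
      topK: 8,
    },
  },
});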
Generating structured output (like JSON and enums) is supported for
inference using both cloud-hosted and on-device models.
For hybrid inference, use both inCloudParams and onDeviceParams to configure the model to respond with structured output. For the other modes,
use only the applicable configuration.
For inCloudParams: Specify the appropriate responseMimeType (in
this example, application/json) as well as the responseSchema that you
want the model to use.
For onDeviceParams: Specify the responseConstraint that you
want the model to use.
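Below is a minimal sketch for JSON output (the Schema helper is part of firebase/ai; placing responseConstraint under a promptOptions field in onDeviceParams is an illustrative assumption):

// Sketch: request JSON structured output from both cloud-hosted and on-device models.
import { Schema } from "firebase/ai";

const reviewSchema = Schema.object({
  properties: {
    title: Schema.string(),
    rating: Schema.number(),
  },
});

const structuredModel = getGenerativeModel(ai, {
  mode: InferenceMode.PREFER_ON_DEVICE,
  inCloudParams: {
    generationConfig: {
      responseMimeType: "application/json",
      responseSchema: reviewSchema,
    },
  },
  onDeviceParams: {
    // Assumed field; the Prompt API accepts a response constraint when prompting.
    promptOptions: {
      responseConstraint: reviewSchema,
    },
  },
});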
Features not yet available for on-device inference
As an experimental release, not all the capabilities of the Web SDK are
available for on-device inference. The following features are not yet
supported for on-device inference (but they are usually available for
cloud-based inference).
Generating text from image file input types other than JPEG and PNG
Can fall back to the cloud-hosted model;
however, ONLY_ON_DEVICE mode will throw an error.
Generating text from audio, video, and document (like PDF) inputs
Can fall back to the cloud-hosted model;
however, ONLY_ON_DEVICE mode will throw an error.
Generating images using Gemini or Imagen models
Can fall back to the cloud-hosted model;
however, ONLY_ON_DEVICE mode will throw an error.
Providing files using URLs in multimodal requests. You must provide files as
inline data to on-device models.
Multi-turn chat
Can fall back to the cloud-hosted model;
however, ONLY_ON_DEVICE mode will throw an error.
Bi-directional streaming with the Gemini Live API
Note that this isn't supported by the Firebase AI Logic client SDK for Web even for cloud-hosted models.
Using "tools", including function calling and grounding with Google Search
Coming soon!
Count tokens
Always throws an error. The count will differ between cloud-hosted and
on-device models, so there is no intuitive fallback.
AI monitoring in the Firebase console for on-device inference.
Note that any inference using the cloud-hosted models can be
monitored just like other inference using the Firebase AI Logic client SDK for Web.
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Missing the information I need","missingTheInformationINeed","thumb-down"],["Too complicated / too many steps","tooComplicatedTooManySteps","thumb-down"],["Out of date","outOfDate","thumb-down"],["Samples / code issue","samplesCodeIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-09-05 UTC."],[],[],null,[]]