The latest Gemini models, like Gemini 3.1 Flash Image (Nano Banana 2), are available to use with Firebase AI Logic! Learn more.
Gemini 2.0 Flash and Flash-Lite models will shut down on June 1, 2026. To avoid service disruption, update to a newer model like gemini-2.5-flash-lite. Learn more.
Build hybrid experiences in Android apps with on-device and cloud-hosted models
You can build AI-powered Android apps and features with hybrid inference using Firebase AI Logic. Hybrid inference runs inference with on-device models when available and seamlessly falls back to cloud-hosted models otherwise (and vice versa).
Reach more of your audience by accommodating on-device model availability and internet connectivity.
Supported capabilities and features for on-device inference
On-device inference only supports single-turn text generation (not chat), with streaming or non-streaming output. It supports the following text-generation capabilities:
These get started steps describe the required general setup for any supported
prompt request that you want to send.
Step 1: Set up a Firebase project and connect your app to Firebase
Sign in to the Firebase console, and then select your Firebase project.
Don't already have a Firebase project?
If you don't already have a Firebase project, click the button to create a
new Firebase project, and then use either of the following options:
Option 1: Create a wholly new Firebase project (and its underlying Google Cloud project automatically) by entering a new project name in the first step of the workflow.
Option 2: "Add Firebase" to an existing Google Cloud project by clicking Add Firebase to Google Cloud project (at the bottom of the page).
In the first step of the workflow, start entering theproject nameof
the existing project, and then select the project from the displayed list.
Complete the remaining steps of the on-screen workflow to create a Firebase
project. Note that when prompted, you do not need to set up Google Analytics to use the Firebase AI Logic SDKs.
Click Get started to launch a guided workflow that helps you set up the required APIs and resources for your project.
Set up your project to use a "Gemini API" provider.
We recommend getting started with the Gemini Developer API. At any point, you can always set up the Vertex AI Gemini API (and its requirement for billing).
For the Gemini Developer API, the console will enable the required APIs and create a Gemini API key in your project. Do not add this Gemini API key into your app's codebase. Learn more.
If prompted in the console's workflow, follow the on-screen instructions to
register your app and connect it to Firebase.
Continue to the next step in this guide to add the SDK to your app.
Step 2: Add the required SDKs
The Firebase AI Logic SDK for Android (firebase-ai), along with the Firebase AI Logic On-Device SDK (firebase-ai-ondevice), provides access to the APIs for interacting with generative models.
In your module (app-level) Gradle file (like <project>/<app-module>/build.gradle.kts), add the dependencies for the Firebase AI Logic libraries for Android:
Kotlin
dependencies {
    // ... other androidx dependencies

    // Add the dependencies for the Firebase AI Logic libraries
    // Note that the on-device SDK is not yet included in the Firebase Android BoM
    implementation("com.google.firebase:firebase-ai:17.11.0")
    implementation("com.google.firebase:firebase-ai-ondevice:16.0.0-beta01")
}
Java
For Java, you need to add two additional libraries.
dependencies {
    // ... other androidx dependencies

    // Add the dependencies for the Firebase AI Logic libraries
    // Note that the on-device SDK is not yet included in the Firebase Android BoM
    implementation("com.google.firebase:firebase-ai:17.11.0")
    implementation("com.google.firebase:firebase-ai-ondevice:16.0.0-beta01")

    // Required for one-shot operations (to use `ListenableFuture` from Guava Android)
    implementation("com.google.guava:guava:31.0.1-android")

    // Required for streaming operations (to use `Publisher` from Reactive Streams)
    implementation("org.reactivestreams:reactive-streams:1.0.4")
}
Step 3: Check if the on-device model is available
Using FirebaseAIOnDevice, check if the on-device model is available, and download the model if it's not available.
Once downloaded, AICore will automatically keep the model updated. Check out the
notes after the snippet for more details about AICore and managing the
on-device model download.
Kotlin
val status = FirebaseAIOnDevice.checkStatus()
when (status) {
    OnDeviceModelStatus.UNAVAILABLE -> {
        Log.w(TAG, "On-device model is unavailable")
    }
    OnDeviceModelStatus.DOWNLOADABLE -> {
        FirebaseAIOnDevice.download().collect { status ->
            when (status) {
                is DownloadStatus.DownloadStarted ->
                    Log.w(TAG, "Starting download - ${status.bytesToDownload}")
                is DownloadStatus.DownloadInProgress ->
                    Log.w(TAG, "Download in progress ${status.totalBytesDownloaded} bytes downloaded")
                is DownloadStatus.DownloadCompleted ->
                    Log.w(TAG, "On-device model download complete")
                is DownloadStatus.DownloadFailed ->
                    Log.e(TAG, "Download failed ${status}")
            }
        }
    }
    OnDeviceModelStatus.DOWNLOADING -> {
        Log.w(TAG, "On-device model is being downloaded")
    }
    OnDeviceModelStatus.AVAILABLE -> {
        Log.w(TAG, "On-device model is available")
    }
}
Java
Checking for and downloading the model is not yet available for Java.
However, all other APIs and interactions in this guide are available for Java.
Note the following about downloading the on-device model:
The time it takes to download the on-device model depends on many factors,
including your network.
If your code uses an on-device model for its primary or fallback inference,
make sure the model is downloaded early in your app's lifecycle so that the
on-device model is available before your end-users encounter the code in your
app.
If the on-device model is not available when an on-device inference request is made, the SDK will not automatically trigger the download of the on-device model. The SDK will either fall back to the cloud-hosted model or throw an exception (see details about the behavior of inference modes).
AICore (an Android system service) manages which model and version is downloaded and keeps the model updated for you. Note that the device only has one model downloaded, so if another app on the device has previously successfully downloaded the on-device model, then this check will return that the model is available.
Latency optimization
To optimize the latency of the first inference call, you can have your app call warmup().
This loads the on-device model into memory and initializes runtime components.
Step 4: Initialize the service and create a model instance
Click your Gemini API provider to view provider-specific content and code on this page.
Set up the following before you send a prompt request to the model.
Initialize the service for your chosen API provider.
Create a GenerativeModel instance, and set the mode to one of the following. The descriptions here are very high-level, but you can learn details about the behavior of these modes in Set an inference mode.
PREFER_ON_DEVICE: Attempt to use the on-device model; otherwise, fall back to the cloud-hosted model.
ONLY_ON_DEVICE: Attempt to use the on-device model; otherwise, throw an exception.
PREFER_IN_CLOUD: Attempt to use the cloud-hosted model; otherwise, fall back to the on-device model.
ONLY_IN_CLOUD: Attempt to use the cloud-hosted model; otherwise, throw an exception.
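The fallback behavior of the four modes can be sketched as plain decision logic. This is a hypothetical illustration only, not the SDK's actual implementation; the enum constants mirror the InferenceMode names, and availability is modeled as two booleans for brevity (a real app would learn these from checkStatus() and from request failures):

```java
// Hypothetical sketch of how the four inference modes resolve, given
// whether the on-device model and the cloud-hosted model are usable.
// Mirrors the mode descriptions above; not the SDK's implementation.
enum Mode { PREFER_ON_DEVICE, ONLY_ON_DEVICE, PREFER_IN_CLOUD, ONLY_IN_CLOUD }

class ModeResolver {
    // Returns "ON_DEVICE" or "IN_CLOUD", or throws when the mode's
    // requirement cannot be met (the "throw an exception" cases above).
    static String resolve(Mode mode, boolean onDeviceAvailable, boolean cloudReachable) {
        switch (mode) {
            case PREFER_ON_DEVICE:
                if (onDeviceAvailable) return "ON_DEVICE";
                if (cloudReachable) return "IN_CLOUD";   // fall back to cloud
                throw new IllegalStateException("No model available");
            case ONLY_ON_DEVICE:
                if (onDeviceAvailable) return "ON_DEVICE";
                throw new IllegalStateException("On-device model unavailable");
            case PREFER_IN_CLOUD:
                if (cloudReachable) return "IN_CLOUD";
                if (onDeviceAvailable) return "ON_DEVICE"; // fall back on-device
                throw new IllegalStateException("No model available");
            default: // ONLY_IN_CLOUD
                if (cloudReachable) return "IN_CLOUD";
                throw new IllegalStateException("Cloud-hosted model unreachable");
        }
    }
}
```

In practice "cloud reachable" is only discovered when a request fails (for example, no connectivity), so the SDK resolves this per request rather than up front.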
Kotlin
// Using this SDK to access on-device inference is an Experimental release and requires opt-in
@OptIn(PublicPreviewAPI::class)
// ...

// Initialize the Gemini Developer API backend service
// Create a GenerativeModel instance with a model that supports your use case
// Set the inference mode (like PREFER_ON_DEVICE to use the on-device model if available)
val model = Firebase.ai(backend = GenerativeBackend.googleAI())
    .generativeModel(
        modelName = "MODEL_NAME",
        onDeviceConfig = OnDeviceConfig(mode = InferenceMode.PREFER_ON_DEVICE)
    )
Java
// Initialize the Gemini Developer API backend service
// Create a GenerativeModel instance with a model that supports your use case
// Set the inference mode (like PREFER_ON_DEVICE to use the on-device model if available)
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI())
    .generativeModel("MODEL_NAME", new OnDeviceConfig(InferenceMode.PREFER_ON_DEVICE));

// Use the GenerativeModelFutures Java compatibility layer which offers
// support for ListenableFuture and Publisher APIs
GenerativeModelFutures model = GenerativeModelFutures.from(ai);
Step 5: Send a prompt request to a model
This section shows you how to send various types of input to generate different
types of output, including:
Generate text from text-only input
Before trying this sample, make sure that you've completed the Get started section of this guide.
You can use generateContent() to generate text from a prompt that contains text:
Kotlin
// Imports + initialization of Gemini API backend service + creation of model instance

// Provide a prompt that contains text
val prompt = "Write a story about a magic backpack."

// To generate text output, call generateContent with the text input
val response = model.generateContent(prompt)
print(response.text)
Java
// Imports + initialization of Gemini API backend service + creation of model instance

// Provide a prompt that contains text
Content prompt = new Content.Builder()
    .addText("Write a story about a magic backpack.")
    .build();

// To generate text output, call generateContent with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);
Note that Firebase AI Logic also supports streaming of text responses using generateContentStream (instead of generateContent).
Generate text from text-and-image (multimodal) input
Before trying this sample, make sure that you've completed the Get started section of this guide.
You can use generateContent() to generate text from a prompt that contains text and up to one image file (Bitmap only), providing each input file's mimeType and the file itself.
Kotlin
// Imports + initialization of Gemini API backend service + creation of model instance

// Loads an image from the app/res/drawable/ directory
val bitmap: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)

// Provide a prompt that includes the image specified above and text
val prompt = content {
    image(bitmap)
    text("What developer tool is this mascot from?")
}

// To generate text output, call generateContent with the prompt
val response = model.generateContent(prompt)
print(response.text)
Java
// Imports + initialization of Gemini API backend service + creation of model instance

Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);

// Provide a prompt that includes the image specified above and text
Content content = new Content.Builder()
    .addImage(bitmap)
    .addText("What developer tool is this mascot from?")
    .build();

// To generate text output, call generateContent with the prompt
ListenableFuture<GenerateContentResponse> response = model.generateContent(content);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);
Note that Firebase AI Logic also supports streaming of text responses using generateContentStream (instead of generateContent).
What else can you do?
You can use various additional configuration options and capabilities for your
hybrid experiences:
Features not yet available for on-device inference
Because this is an experimental release, not all the capabilities of cloud-hosted models are available for on-device inference. The features listed in this section are not yet available for on-device inference. If you want to use any of these features, then we recommend using the ONLY_IN_CLOUD inference mode for a more consistent experience.
Generating structured output (like JSON or enums)
Generating text from image file input types other than Bitmap
(image loaded into memory)
Generating text from more than one image file
Generating text from audio, video, and document (like PDF) inputs
Generating images using Gemini or Imagen models
Providing files using URLs in multimodal requests. You must provide files as inline data to on-device models.
Sending requests that exceed 4000 tokens
(or approximately 3000 English words).
Multi-turn chat
Providing the model with tools to help it generate its response (like function calling, code execution, URL context, and grounding with Google Search)
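To stay under the token limit noted above, it can help to pre-check prompt size on the client before choosing an inference path. A rough sketch, using the ratio this page implies (4,000 tokens for approximately 3,000 English words, or about 4/3 tokens per word); the helper names are hypothetical, and real tokenization varies by model, so treat this only as a coarse heuristic:

```java
// Rough client-side estimate of whether a prompt might exceed the
// on-device token limit. Assumes ~4/3 tokens per English word, the
// ratio implied by "4000 tokens (or approximately 3000 English words)".
// Real tokenizers differ, so this is only a coarse pre-check.
class TokenEstimate {
    static final int ON_DEVICE_TOKEN_LIMIT = 4000;

    // Approximate token count from the whitespace-separated word count.
    static int estimateTokens(String prompt) {
        String trimmed = prompt.trim();
        if (trimmed.isEmpty()) return 0;
        int words = trimmed.split("\\s+").length;
        return (int) Math.ceil(words * 4.0 / 3.0);
    }

    // True if the prompt likely fits within the on-device limit.
    static boolean likelyFitsOnDevice(String prompt) {
        return estimateTokens(prompt) <= ON_DEVICE_TOKEN_LIMIT;
    }
}
```

A prompt that fails this check is a candidate for an inference mode that can use a cloud-hosted model instead.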
AI monitoring in the Firebase console does not show any data for on-device inference (including on-device logs). However, any inference that uses a cloud-hosted model can be monitored just like other inference via Firebase AI Logic.
Additional limitations
In addition to the above, on-device inference has the following limitations (learn more in the ML Kit documentation):
The end-user of your app must be using a supported device for on-device inference.
Your app can only run on-device inference when it's in the foreground.
Only English and Korean have been validated for on-device inference.
The maximum token limit for the entire on-device inference request is
4000 tokens. If your requests might exceed this limit, then make sure to
configure an inference mode that can use a cloud-hosted model.
We recommend avoiding on-device inference use cases that require long
output (more than 256 tokens).
AICore (an Android system service that manages the on-device models) enforces an inference quota per app. Making too many API requests in a short period will result in an ErrorCode.BUSY response. If you're receiving this error, consider using exponential backoff to retry the request. Also, ErrorCode.PER_APP_BATTERY_USE_QUOTA_EXCEEDED can be returned if an app exceeds a long-duration quota (for example, a daily quota).
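The exponential backoff suggested above can be sketched as a generic retry helper. This is a minimal illustration, not SDK code: the operation is a stand-in for your inference call, and in a real app you would catch only the "busy" error rather than every exception:

```java
import java.util.concurrent.Callable;

// Generic exponential-backoff retry helper, as suggested for
// ErrorCode.BUSY responses. The delay doubles on each failed attempt,
// up to a cap, and the last exception is rethrown when attempts run out.
class Backoff {
    static <T> T retry(Callable<T> operation, int maxAttempts,
                       long initialDelayMs, long maxDelayMs) throws Exception {
        long delay = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {   // in a real app, catch only the busy error
                last = e;
                if (attempt == maxAttempts) break;
                Thread.sleep(delay);  // keep this off the main thread in an app
                delay = Math.min(delay * 2, maxDelayMs);
            }
        }
        throw last;
    }
}
```

In practice you would also add random jitter to each delay so that many clients hitting the quota don't retry in lockstep.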
Last updated 2026-04-17 UTC.