Build hybrid experiences in Apple apps with on-device and cloud-hosted models
You can build AI-powered Apple apps and features with hybrid inference using Firebase AI Logic. Hybrid inference enables running inference using
on-device models (specifically Apple's Foundation Models framework) when
available and seamlessly falling back to cloud-hosted Google models otherwise
(and vice versa).
On-device inference uses Apple's Foundation Models framework, which is
only available on devices with Apple Intelligence enabled.
The on-device model is automatically downloaded when Apple Intelligence
is enabled.
Get started
Make sure that you've reviewed the section above describing supported
capabilities, APIs, and devices.
These get started steps describe the required general setup for any supported
prompt request that you want to send.
Step 1: Set up a Firebase project and connect your app to Firebase
Sign in to the Firebase console,
and then select your Firebase project.
Don't already have a Firebase project?
If you don't already have a Firebase project, click the button to create a
new Firebase project, and then use either of the following options:
Option 1: Create a wholly new Firebase project (and its underlying Google Cloud project automatically) by entering a new project name in the
first step of the workflow.
Option 2: "Add Firebase" to an existing Google Cloud project by
clicking Add Firebase to Google Cloud project (at bottom of page).
In the first step of the workflow, start entering the project name of
the existing project, and then select the project from the displayed list.
Complete the remaining steps of the on-screen workflow to create a Firebase
project. Note that when prompted, you do not need to set up Google Analytics to use the Firebase AI Logic SDKs.
Click Get started to launch a guided workflow that helps you set up the required APIs and resources for your project.
Set up your project to use a "Gemini API" provider.
We recommend getting started using the Gemini Developer API. At any point, you can always set up the Vertex AI Gemini API (and its requirement for billing).
For the Gemini Developer API, the console will enable the required
APIs and create a Gemini API key in your project. Do not add this Gemini API key into your app's codebase.
If prompted in the console's workflow, follow the on-screen instructions to
register your app and connect it to Firebase.
Continue to the next step in this guide to add the SDK to your app.
Step 2: Add the required SDKs
Use Swift Package Manager (SPM) to install and manage Xcode dependencies. Hybrid
support is only available when using SPM.
The Firebase AI Logic library provides access to the APIs for interacting
with generative models. The library is included as part of the Firebase SDK
for Apple platforms (firebase-ios-sdk).
If you're already using Firebase, then make sure your Firebase package is
v12.13.0 or later.
In Xcode, with your app project open, navigate to File > Add Package Dependencies.
When prompted, add the Firebase Apple platforms SDK repository:
https://github.com/firebase/firebase-ios-sdk
Select the latest SDK version.
Select the FirebaseAILogic library.
When finished, Xcode will automatically begin resolving and downloading your
dependencies in the background.
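If you declare dependencies in a `Package.swift` manifest instead of through the Xcode dialog, the same requirement might be expressed roughly as follows. This is a minimal sketch, not an official configuration: the package and target name `MyApp` is hypothetical, and the `platforms` entry is an assumption that you should replace with your app's actual deployment target. Only the repository URL, the minimum version, and the `FirebaseAILogic` library name come from this guide.

```swift
// swift-tools-version: 5.9
import PackageDescription

let package = Package(
    name: "MyApp", // hypothetical package name
    platforms: [
        .iOS("15.0") // assumption: replace with your app's deployment target
    ],
    dependencies: [
        // Repository URL and minimum version as stated in this guide
        .package(url: "https://github.com/firebase/firebase-ios-sdk", from: "12.13.0")
    ],
    targets: [
        .target(
            name: "MyApp", // hypothetical target name
            dependencies: [
                // The library selected in the steps above
                .product(name: "FirebaseAILogic", package: "firebase-ios-sdk")
            ]
        )
    ]
)
```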
Step 3: Initialize the service and create a model session instance
Set up the following before you send a prompt request to the model.
Initialize the service for your chosen Gemini API provider.
Create a GenerativeModelSession instance with a HybridModel.
Set the primary and secondary models based on your preferences. You can
set the order of attempted inference:
Attempt on-device inference first, but allow fallback to cloud: set primary to a "system" model and secondary to a cloud model.
Attempt in-cloud inference first, but allow fallback to on-device:
set primary to a cloud model and secondary to a "system" model.
Note that the SDK supports setting only a single model, which means the
SDK will only attempt either on-device or in-cloud inference. However, for
a hybrid experience, you need to create a HybridModel and set both primary and secondary models.
Learn more about the behavior of "inference modes" (the order of attempted
inference) in Configuration options.
The following example shows how to attempt on-device inference first, but allow
fallback to the cloud-hosted model:
    import FirebaseAILogic

    // Initialize the Gemini Developer API backend service
    let ai = FirebaseAI.firebaseAI(backend: .googleAI())

    // Initialize a cloud model that supports your use case
    let cloudModel = ai.geminiModel(name: "GEMINI_MODEL_NAME")

    // Initialize an on-device model that supports your use case
    let systemModel = FirebaseAI.SystemLanguageModel.default

    // Create a Hybrid Model
    // Provide your preferred model as `primary` and your fallback model as `secondary`
    // In this example, attempt to use on-device model; otherwise, fall back to cloud.
    let hybridModel = HybridModel(primary: systemModel, secondary: cloudModel)

    // Create a GenerativeModelSession with the HybridModel created earlier.
    // Use the same `ai` instance that was initialized above.
    let session = ai.generativeModelSession(model: hybridModel)
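To make the primary/secondary ordering concrete, here is a small, self-contained sketch of the general fallback pattern. It does not use the Firebase AI Logic SDK: `TextModel`, `OnDeviceStub`, `CloudStub`, and `FallbackPair` are hypothetical stand-ins, and the assumption that fallback is triggered by a thrown error is made for illustration only. The SDK's actual fallback conditions may differ.

```swift
// Hypothetical stand-ins for illustration; these are NOT Firebase AI Logic types.
protocol TextModel {
    func respond(to prompt: String) throws -> String
}

struct ModelUnavailable: Error {}

// Simulates an on-device model that may be unavailable on this device.
struct OnDeviceStub: TextModel {
    let isAvailable: Bool
    func respond(to prompt: String) throws -> String {
        guard isAvailable else { throw ModelUnavailable() }
        return "[on-device] \(prompt)"
    }
}

// Simulates a cloud-hosted model that is always reachable.
struct CloudStub: TextModel {
    func respond(to prompt: String) throws -> String {
        "[cloud] \(prompt)"
    }
}

// Tries `primary` first and falls back to `secondary` if `primary` throws.
// Assumption: a thrown error is what triggers fallback in this sketch.
struct FallbackPair {
    let primary: any TextModel
    let secondary: any TextModel

    func respond(to prompt: String) throws -> String {
        do {
            return try primary.respond(to: prompt)
        } catch {
            return try secondary.respond(to: prompt)
        }
    }
}

// On-device first; the stub is unavailable, so the cloud stub answers.
let pair = FallbackPair(primary: OnDeviceStub(isAvailable: false), secondary: CloudStub())
print(try pair.respond(to: "Hello")) // prints "[cloud] Hello"
```

Swapping the two arguments gives the cloud-first ordering, which mirrors how `primary` and `secondary` are described above.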
Before trying this sample, make sure that you've completed the Get started section of this guide.
To generate text from a prompt that contains text, use respond(to:) like so:
    // Imports + initialization of Gemini API backend service + creation of model session

    // Provide a prompt that contains text
    let prompt = "Write a story about a magic backpack."

    // To generate text output, call `respond(to:)` with the text input
    let response = try await session.respond(to: prompt)
    print(response.content)
Stream text from text-only input
Before trying this sample, make sure that you've completed the Get started section of this guide.
You can achieve faster interactions by not waiting for the entire result from
the model generation, and instead use streaming to handle partial results. To stream generated text from a prompt that contains text, use streamResponse(to:) like so:
    // Imports + initialization of Gemini API backend service + creation of model session

    // Provide a prompt that contains text
    let prompt = "Write a story about a magic backpack."

    // To stream generated text output, call `streamResponse(to:)` with the text input
    let stream = session.streamResponse(to: prompt)
    for try await snapshot in stream {
        print(snapshot.content)
    }
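If each streamed value holds the text generated so far (an assumption suggested by the `snapshot` name, not something this guide confirms), a UI that appends text may want only the newly added portion. The following self-contained sketch shows one way to derive that portion. It uses a plain `AsyncStream<String>` as a hypothetical stand-in for the SDK's stream, and the helper names `newText(in:since:)`, `makeSnapshotStream()`, and `collectDeltas(from:)` are illustrative only.

```swift
// Returns the text appended since `previous`, assuming `current` extends it.
// If `current` does not start with `previous`, the whole snapshot is returned.
func newText(in current: String, since previous: String) -> String {
    guard current.hasPrefix(previous) else { return current }
    return String(current.dropFirst(previous.count))
}

// Hypothetical stand-in for a stream of cumulative snapshots.
func makeSnapshotStream() -> AsyncStream<String> {
    AsyncStream { continuation in
        for snapshot in ["Once", "Once upon", "Once upon a time"] {
            continuation.yield(snapshot)
        }
        continuation.finish()
    }
}

// Collects the incremental pieces emitted across all snapshots.
func collectDeltas(from stream: AsyncStream<String>) async -> [String] {
    var previous = ""
    var deltas: [String] = []
    for await snapshot in stream {
        deltas.append(newText(in: snapshot, since: previous))
        previous = snapshot
    }
    return deltas
}

let deltas = await collectDeltas(from: makeSnapshotStream())
print(deltas.joined()) // prints "Once upon a time"
```

If the stream instead delivers independent chunks, this step is unnecessary and each value can be appended directly.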
What else can you do?
You can use various additional configuration options and capabilities for your
hybrid experiences:
Features not-yet-supported for hybrid or on-device inference
As an experimental release, not all the capabilities of Firebase AI Logic or cloud-hosted models are supported.
The following are not supported for hybrid or on-device
implementations: Imagen models, the Gemini Live API, and prompt
templates. Also, count tokens shouldn't be relied upon because the count will
differ between cloud-hosted and on-device models, so there's no intuitive
fallback.
The following features are not yet supported for on-device inference. If you want to use any of these features, then we recommend using only a
cloud-hosted model for a more consistent experience.
Generating text from multimodal inputs, like images, audio, video, and
documents (PDFs)
Generating media, like images, audio, or video
Sending requests that exceed 4096 tokens (or approximately 3000 English
words).
Providing the on-device model with built-in tools to help it generate
its response (like code execution, URL context, and Grounding with
Google Search)
AI monitoring in the Firebase console does not show any data for
on-device inference (including on-device logs). However, any inference that
uses a cloud-hosted model can be monitored just like other inference via Firebase AI Logic.
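Because requests over 4096 tokens (roughly 3000 English words) are not supported on-device, an app might check the prompt size before deciding whether to use a hybrid session or a cloud-only model. The sketch below is a rough heuristic under stated assumptions: it scales a whitespace-separated word count by the 4096/3000 ratio quoted above, which is not a real tokenizer, and the names `OnDeviceLimits`, `estimatedTokenCount(for:)`, `Route`, and `suggestedRoute(for:)` are hypothetical and not part of the SDK.

```swift
// Figures taken from the limit described above.
enum OnDeviceLimits {
    static let tokens = 4096
    static let approximateWords = 3000
}

// Rough estimate only: scales the word count by 4096/3000, rounding up.
func estimatedTokenCount(for prompt: String) -> Int {
    let words = prompt.split(whereSeparator: { $0.isWhitespace }).count
    let scaled = words * OnDeviceLimits.tokens
    return (scaled + OnDeviceLimits.approximateWords - 1) / OnDeviceLimits.approximateWords
}

enum Route {
    case hybrid     // prompt likely fits the on-device limit
    case cloudOnly  // prompt likely exceeds the on-device limit
}

// Suggests cloud-only when the estimate exceeds the on-device limit.
func suggestedRoute(for prompt: String) -> Route {
    estimatedTokenCount(for: prompt) > OnDeviceLimits.tokens ? .cloudOnly : .hybrid
}

print(suggestedRoute(for: "Write a story about a magic backpack.")) // prints "hybrid"
```

A real token count can differ noticeably from this estimate, so leaving some headroom below the limit is a reasonable precaution.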
Additional limitations
In addition to the above, on-device inference has the following
limitations:
Last updated 2026-05-27 UTC.