Build hybrid experiences in Apple apps with on-device and cloud-hosted models

You can build AI-powered Apple apps and features with hybrid inference using Firebase AI Logic. Hybrid inference runs on-device models (specifically, Apple's Foundation Models framework) when they are available and seamlessly falls back to cloud-hosted Google models otherwise (or vice versa).

This page describes how to get started using the client SDK and shows additional configuration options and capabilities, like temperature.

Note that on-device inference via Firebase AI Logic is supported for Apple apps that use Firebase AI Logic SDK v12.13.0 or later and run on devices with Apple Intelligence enabled. It's governed by the Acceptable use requirements for Apple's Foundation Models framework.

Recommended use cases

Using an on-device model for inference offers:
- Enhanced privacy
- Inference at no cost
- Offline functionality

Using hybrid functionality lets you:
- Provide all customers with a similar app experience, regardless of the end user's device
- Improve availability of generative AI features, regardless of internet connectivity, quota limitations, or device capabilities

Supported capabilities, APIs, and devices

Before you implement hybrid and on-device inference using Firebase AI Logic, review this section to understand what's supported for Apple apps.

Supported capabilities and features for on-device inference

On-device inference supports only text generation, specifically the following capabilities:
- Generating text from text-only input
- Streaming text from text-only input

Make sure to review the list of features not yet supported for hybrid or on-device inference at the bottom of this page.

Supported APIs and devices

- In-cloud inference uses your chosen Gemini API provider (either the Gemini Developer API or the Vertex AI Gemini API).
- On-device inference uses Apple's Foundation Models framework, which is available only on devices with Apple Intelligence enabled. The on-device model is downloaded automatically when Apple Intelligence is enabled.

Important: On-device model usage is governed by the Acceptable use requirements for Apple's Foundation Models framework.

Get started

Make sure that you've reviewed the section above describing supported capabilities, APIs, and devices.

These getting-started steps describe the general setup required for any supported prompt request that you want to send.

Step 1: Set up a Firebase project and connect your app to Firebase

Sign in to the Firebase console, and then select your Firebase project.

Don't already have a Firebase project?

If you don't already have a Firebase project, click the button to create a new Firebase project, and then use either of the following options:
- Option 1: Create a wholly new Firebase project (and its underlying Google Cloud project automatically) by entering a new project name in the first step of the workflow.
- Option 2: "Add Firebase" to an existing Google Cloud project by clicking Add Firebase to Google Cloud project (at the bottom of the page). In the first step of the workflow, start entering the project name of the existing project, and then select the project from the displayed list.
Complete the remaining steps of the on-screen workflow to create a Firebase project. Note that when prompted, you do not need to set up Google Analytics to use the Firebase AI Logic SDKs.

Step 3: Initialize the service and create a model session instance

```swift
// Initialize the Gemini Developer API backend service
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Initialize a cloud model that supports your use case
let cloudModel = ai.geminiModel(name: "GEMINI_MODEL_NAME")

// Initialize an on-device model that supports your use case
let systemModel = FirebaseAI.SystemLanguageModel.default

// Create a hybrid model.
// Provide your preferred model as `primary` and your fallback model as `secondary`.
// In this example, attempt to use the on-device model; otherwise, fall back to cloud.
let hybridModel = HybridModel(primary: systemModel, secondary: cloudModel)

// Create a GenerativeModelSession with the HybridModel created earlier,
// using the `ai` instance initialized above.
let session = ai.generativeModelSession(model: hybridModel)
```
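To make the primary/secondary arrangement concrete, here is a minimal plain-Swift sketch of the fallback idea. It does not use the Firebase AI Logic SDK and is not how `HybridModel` is implemented: `TextGenerating`, `StubGenerator`, `FallbackGenerator`, and `GenerationError` are hypothetical names, and falling back on any thrown error is an assumption made purely for illustration.

```swift
// Illustrative only: hypothetical types, not part of the Firebase AI Logic SDK.

/// Hypothetical error for a generator that cannot serve a request.
enum GenerationError: Error {
    case unavailable
}

/// Hypothetical abstraction over anything that turns a prompt into text.
protocol TextGenerating {
    func respond(to prompt: String) async throws -> String
}

/// Stand-in for a real model: returns a labeled echo, or throws when unavailable.
struct StubGenerator: TextGenerating {
    let label: String
    let isAvailable: Bool

    func respond(to prompt: String) async throws -> String {
        guard isAvailable else { throw GenerationError.unavailable }
        return "[\(label)] \(prompt)"
    }
}

/// Tries `primary` first; if it throws, retries the same prompt on `secondary`.
struct FallbackGenerator: TextGenerating {
    let primary: any TextGenerating
    let secondary: any TextGenerating

    func respond(to prompt: String) async throws -> String {
        do {
            return try await primary.respond(to: prompt)
        } catch {
            // Assumption: any failure of the primary triggers the fallback.
            return try await secondary.respond(to: prompt)
        }
    }
}
```

Swapping the `primary` and `secondary` arguments gives the cloud-first variant, which mirrors the "vice versa" case described at the top of this page.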

Step 4: Send a prompt request to a model

Generate text from text-only input

```swift
// Imports + initialization of Gemini API backend service + creation of model session

// Provide a prompt that contains text
let prompt = "Write a story about a magic backpack."

// To generate text output, call `respond(to:)` with the text input
let response = try await session.respond(to: prompt)
print(response.content)
```

Stream text from text-only input

```swift
// Imports + initialization of Gemini API backend service + creation of model session

// Provide a prompt that contains text
let prompt = "Write a story about a magic backpack."

// To stream generated text output, call `streamResponse(to:)` with the text input
let stream = session.streamResponse(to: prompt)
for try await snapshot in stream {
    print(snapshot.content)
}
```
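If you only need the latest text (for example, to refresh a label as output arrives), a streaming loop can keep just the most recent snapshot. The sketch below shows that pattern with a plain `AsyncThrowingStream` standing in for the SDK stream. `makeDemoStream` and `latestContent` are hypothetical helpers, and the assumption that each snapshot carries the full text generated so far (rather than only the new fragment) should be verified against the SDK's actual behavior.

```swift
// Illustrative only: a plain AsyncThrowingStream stands in for the SDK's stream.

/// Hypothetical demo source that yields cumulative snapshots of the text.
func makeDemoStream() -> AsyncThrowingStream<String, Error> {
    AsyncThrowingStream { continuation in
        for snapshot in ["Once", "Once upon", "Once upon a time"] {
            continuation.yield(snapshot)
        }
        continuation.finish()
    }
}

/// Consumes the stream and returns the last snapshot seen (nil if it was empty).
/// Assumption: snapshots are cumulative, so the last one is the complete text.
func latestContent(from stream: AsyncThrowingStream<String, Error>) async throws -> String? {
    var latest: String?
    for try await snapshot in stream {
        latest = snapshot
    }
    return latest
}
```

If snapshots turn out to be incremental fragments instead, the loop would need to append each one rather than overwrite `latest`.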

Features not-yet-supported for hybrid or on-device inference

Because this is an experimental release, not all capabilities of Firebase AI Logic or cloud-hosted models are supported.

The following are not supported for hybrid or on-device implementations: Imagen models, the Gemini Live API, and prompt templates. Also, don't rely on counting tokens: the count differs between cloud-hosted and on-device models, so there's no intuitive fallback.

The following features are not yet supported for on-device inference. If you want to use any of them, we recommend using only a cloud-hosted model for a more consistent experience.
- Generating text from multimodal inputs, like images, audio, video, and documents (PDFs)
- Generating media, like images, audio, or video
- Sending requests that exceed 4096 tokens (approximately 3000 English words)
- Providing the on-device model with built-in tools to help it generate its response (like code execution, URL context, and Grounding with Google Search)
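Because the on-device limit is expressed in tokens, an app may want a rough pre-check before deciding which model to use. The sketch below relies only on the 4096-token limit and the approximately-3000-English-words ratio stated above. `OnDeviceLimits`, `estimatedTokenCount`, and `likelyFitsOnDevice` are hypothetical helpers, and a whitespace word count is a crude heuristic rather than a real tokenizer, so treat the result as an estimate (especially for non-English text).

```swift
// Illustrative only: hypothetical helpers, not part of the Firebase AI Logic SDK.

/// Figures taken from the limitation listed on this page.
enum OnDeviceLimits {
    static let tokenLimit = 4096          // maximum request size in tokens
    static let referenceWordCount = 3000  // approximate English words for that limit
}

/// Estimates tokens by counting whitespace-separated words and scaling
/// by 4096/3000, rounding up with integer arithmetic.
func estimatedTokenCount(for text: String) -> Int {
    let wordCount = text.split(whereSeparator: { $0.isWhitespace }).count
    let scaled = wordCount * OnDeviceLimits.tokenLimit
    return (scaled + OnDeviceLimits.referenceWordCount - 1) / OnDeviceLimits.referenceWordCount
}

/// Returns true when the estimate stays within the on-device limit.
func likelyFitsOnDevice(_ prompt: String) -> Bool {
    estimatedTokenCount(for: prompt) <= OnDeviceLimits.tokenLimit
}
```

A prompt that fails this check could be sent to a cloud-only session instead. In practice it may be wise to leave some headroom below the limit, since the estimate is coarse.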

AI monitoring in the Firebase console does not show any data for on-device inference (including on-device logs). However, any inference that uses a cloud-hosted model can be monitored just like other inference via Firebase AI Logic.

Additional limitations

In addition to the above, on-device inference has the following limitations:
- The end user of your app must be using a device with Apple Intelligence enabled.
- Your app can run on-device inference only when it is in the foreground.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-05-27 UTC.
