Build hybrid experiences in Apple apps with on-device and cloud-hosted models

You can build AI-powered Apple apps and features with hybrid inference using Firebase AI Logic. Hybrid inference runs on-device models (specifically, Apple's Foundation Models framework) when they're available and seamlessly falls back to cloud-hosted Google models otherwise (or the reverse, depending on the order you configure).

This page describes how to get started using the client SDK and shows additional configuration options and capabilities, like temperature.

Note that on-device inference via Firebase AI Logic is supported for Apple apps that use Firebase AI Logic SDK v12.13.0+ and run on devices with Apple Intelligence enabled. It's governed by the Acceptable use requirements for Apple's Foundation Models framework.

Recommended use cases

  • Using an on-device model for inference offers:

    • Enhanced privacy
    • Inference at no cost
    • Offline functionality
  • Using hybrid functionality offers:

    • A similar app experience for all customers, regardless of the end user's device
    • Improved availability of generative AI features, regardless of internet connectivity, quota limitations, or device capabilities

Supported capabilities, APIs, and devices

Before you implement hybrid and on-device inference using Firebase AI Logic, review this section to understand what's supported for Apple apps.

Supported capabilities and features for on-device inference

On-device inference only supports text generation, specifically the following text-generation capabilities:

Make sure to review the detailed list of features not yet supported for hybrid or on-device inference at the bottom of this page.

Supported APIs and devices

Get started

Make sure that you've reviewed the section above describing supported capabilities, APIs, and devices.

These get started steps describe the required general setup for any supported prompt request that you want to send.

Step 1: Set up a Firebase project and connect your app to Firebase

  1. Sign in to the Firebase console, and then select your Firebase project.

    Don't already have a Firebase project?

    If you don't already have a Firebase project, click the button to create a new Firebase project, and then use either of the following options:

    • Option 1: Create a wholly new Firebase project (and its underlying Google Cloud project automatically) by entering a new project name in the first step of the workflow.

    • Option 2: "Add Firebase" to an existing Google Cloud project by clicking Add Firebase to Google Cloud project (at the bottom of the page). In the first step of the workflow, start entering the project name of the existing project, and then select the project from the displayed list.

    Complete the remaining steps of the on-screen workflow to create a Firebase project. Note that when prompted, you do not need to set up Google Analytics to use the Firebase AI Logic SDKs.

  2. In the Firebase console, go to AI Services > AI Logic.

  3. Click Get started to launch a guided workflow that helps you set up the required APIs and resources for your project.

  4. Set up your project to use a "Gemini API" provider.

    We recommend getting started using the Gemini Developer API. At any point, you can set up the Vertex AI Gemini API (and its requirement for billing).

    For the Gemini Developer API, the console will enable the required APIs and create a Gemini API key in your project.
    Do not add this Gemini API key into your app's codebase. Learn more.

  5. If prompted in the console's workflow, follow the on-screen instructions to register your app and connect it to Firebase.

  6. Continue to the next step in this guide to add the SDK to your app.

    Step 2: Add the required SDKs

    Use Swift Package Manager (SPM) to install and manage Xcode dependencies. Hybrid support is only available when using SPM.

    The Firebase AI Logic library provides access to the APIs for interacting with generative models. The library is included as part of the Firebase SDK for Apple platforms (firebase-ios-sdk).

    If you're already using Firebase, then make sure your Firebase package is v12.13.0 or later.

    1. In Xcode, with your app project open, navigate to File > Add Package Dependencies.

    2. When prompted, add the Firebase Apple platforms SDK repository:

       https://github.com/firebase/firebase-ios-sdk 
      
    3. Select the latest SDK version.

    4. Select the FirebaseAILogic library.

    When finished, Xcode will automatically begin resolving and downloading your dependencies in the background.
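If your feature lives in a Swift package rather than directly in an Xcode app project, the same dependency can be declared in a package manifest. The following is a minimal sketch, not an official template: the package and target name `MyHybridFeature` is hypothetical, and the `.iOS(.v15)` minimum is an assumption that you should check against the Firebase release notes and the device requirements for on-device inference.

```swift
// swift-tools-version: 5.9
import PackageDescription

let package = Package(
    name: "MyHybridFeature", // hypothetical package name
    platforms: [
        .iOS(.v15) // assumed minimum; verify against the Firebase release notes
    ],
    products: [
        .library(name: "MyHybridFeature", targets: ["MyHybridFeature"])
    ],
    dependencies: [
        // The repository URL and minimum version come from the steps above.
        .package(url: "https://github.com/firebase/firebase-ios-sdk", from: "12.13.0")
    ],
    targets: [
        .target(
            name: "MyHybridFeature",
            dependencies: [
                // The library selected in step 4 above.
                .product(name: "FirebaseAILogic", package: "firebase-ios-sdk")
            ]
        )
    ]
)
```

Using `from: "12.13.0"` lets SPM resolve any later compatible release, which matches the "v12.13.0 or later" requirement.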

    Step 3: Initialize the service and create a model session instance


    Set up the following before you send a prompt request to the model.

    1. Initialize the service for your chosen Gemini API provider.

    2. Create a GenerativeModelSession instance with a HybridModel.

    3. Set the primary and secondary models based on your preferences. You can set the order of attempted inference:

      • Attempt on-device inference first, but allow fallback to cloud: set primary to a "system" model and secondary to a cloud model.

      • Attempt in-cloud inference first, but allow fallback to on-device: set primary to a cloud model and secondary to a "system" model.

      Note that the SDK also supports setting only a single model, in which case it attempts only on-device or only in-cloud inference. For a hybrid experience, however, you need to create a HybridModel and set both primary and secondary models.

      Learn more about the behavior of "inference modes" (the order of attempted inference) in Configuration options.
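To picture how the primary and secondary ordering plays out, here is a small sketch in plain Swift. Every type in it (`SketchModel`, `SketchHybrid`, `Unavailable`) is a hypothetical stand-in and not part of the Firebase AI Logic SDK. It also assumes that any error from the primary model triggers one attempt on the secondary model; the SDK's actual fallback conditions may differ.

```swift
// Hypothetical stand-ins for illustration only; NOT Firebase AI Logic types.
struct Unavailable: Error {}

struct SketchModel {
    let name: String
    let isAvailable: Bool

    // Pretend inference: throws when the model can't be used.
    func respond(to prompt: String) throws -> String {
        guard isAvailable else { throw Unavailable() }
        return "[\(name)] \(prompt)"
    }
}

struct SketchHybrid {
    let primary: SketchModel
    let secondary: SketchModel

    // Assumption: any error from the primary triggers one attempt on the secondary.
    func respond(to prompt: String) throws -> String {
        do {
            return try primary.respond(to: prompt)
        } catch {
            return try secondary.respond(to: prompt)
        }
    }
}

// Simulate a device without a usable on-device model.
let onDevice = SketchModel(name: "system", isAvailable: false)
let cloud = SketchModel(name: "cloud", isAvailable: true)

// On-device first: the system model is unavailable here, so the cloud model answers.
let onDeviceFirst = SketchHybrid(primary: onDevice, secondary: cloud)
let answer = try onDeviceFirst.respond(to: "Hello")
print(answer)
```

Swapping the two arguments gives the cloud-first ordering; in this simulation the cloud model would then answer directly, without any fallback.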

    The following example shows how to attempt on-device inference first, but allow fallback to the cloud-hosted model:

      import FirebaseAILogic

      // Initialize the Gemini Developer API backend service
      let ai = FirebaseAI.firebaseAI(backend: .googleAI())

      // Initialize a cloud model that supports your use case
      let cloudModel = ai.geminiModel(name: "GEMINI_MODEL_NAME")

      // Initialize an on-device model that supports your use case
      let systemModel = FirebaseAI.SystemLanguageModel.default

      // Create a Hybrid Model
      // Provide your preferred model as `primary` and your fallback model as `secondary`
      // In this example, attempt to use on-device model; otherwise, fall back to cloud.
      let hybridModel = HybridModel(
        primary: systemModel,
        secondary: cloudModel
      )

      // Create a GenerativeModelSession with the HybridModel created earlier.
      let session = ai.generativeModelSession(model: hybridModel)
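The reverse ordering uses the same calls with the two models swapped. The sketch below is a cloud-first variant built only from the calls shown in the example above. It assumes that `generativeModelSession(model:)` is called on the same `ai` instance, and `GEMINI_MODEL_NAME` remains a placeholder for a model that supports your use case.

```swift
import FirebaseAILogic

// Initialize the Gemini Developer API backend service
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Same two models as in the on-device-first example
let cloudModel = ai.geminiModel(name: "GEMINI_MODEL_NAME")
let systemModel = FirebaseAI.SystemLanguageModel.default

// Cloud first: attempt the cloud-hosted model, then fall back to the on-device model
let cloudFirstModel = HybridModel(
  primary: cloudModel,
  secondary: systemModel
)

// Assumption: the session is created from the `ai` instance initialized above
let cloudFirstSession = ai.generativeModelSession(model: cloudFirstModel)
```

This ordering may suit features that prefer the cloud-hosted model's capabilities but should keep working offline.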
     
    

    Step 4: Send a prompt request to a model

    This section shows you how to do the following:

    Generate text from text-only input

    Before trying this sample, make sure that you've completed the Get started section of this guide.

    To generate text from a prompt that contains text, use respond(to:) like so:

      // Imports + initialization of Gemini API backend service + creation of model session

      // Provide a prompt that contains text
      let prompt = "Write a story about a magic backpack."

      // To generate text output, call `respond(to:)` with the text input
      let response = try await session.respond(to: prompt)
      print(response.content)
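Because `respond(to:)` is called with `try await`, the request can throw. The sketch below shows one way to keep UI code simple by substituting placeholder text when a request fails. `textOrPlaceholder` and `SimulatedFailure` are hypothetical names, not part of the SDK, and the failure is simulated so that the example runs on its own.

```swift
// Hypothetical helper, not part of the Firebase AI Logic SDK.
// Runs `operation` and returns its text, or `placeholder` if it throws.
@MainActor
func textOrPlaceholder(
    placeholder: String,
    _ operation: () async throws -> String
) async -> String {
    do {
        return try await operation()
    } catch {
        // Log the error so that failures stay visible during development.
        print("Generation failed: \(error)")
        return placeholder
    }
}

// Simulated failure standing in for a request that could not be served.
struct SimulatedFailure: Error {}

let shown = await textOrPlaceholder(placeholder: "Story unavailable right now.") {
    throw SimulatedFailure()
}
print(shown)
```

With a real session, the closure body would be along the lines of `try await session.respond(to: prompt).content`, assuming `content` is a `String` for text prompts.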
     
    

    Stream text from text-only input

    Before trying this sample, make sure that you've completed the Get started section of this guide.

    You can achieve faster interactions by not waiting for the entire result from the model generation, and instead using streaming to handle partial results. To stream generated text from a prompt that contains text, use streamResponse(to:) like so:

      // Imports + initialization of Gemini API backend service + creation of model session

      // Provide a prompt that contains text
      let prompt = "Write a story about a magic backpack."

      // To stream generated text output, call `streamResponse(to:)` with the text input
      let stream = session.streamResponse(to: prompt)
      for try await snapshot in stream {
        print(snapshot.content)
      }
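When showing streamed text in a UI, you usually want to display only the newest state rather than print every snapshot. The sketch below simulates a stream with `AsyncThrowingStream` so that it runs without the SDK. It assumes each snapshot's `content` holds the text generated so far rather than only the newest fragment; verify this against the SDK reference. `simulatedSnapshots` and `finalContent(of:)` are hypothetical names.

```swift
// Simulated snapshots; a real stream would come from `session.streamResponse(to:)`.
func simulatedSnapshots() -> AsyncThrowingStream<String, Error> {
    AsyncThrowingStream { continuation in
        for partial in ["Once", "Once upon", "Once upon a time"] {
            continuation.yield(partial)
        }
        continuation.finish()
    }
}

// Keeps only the newest snapshot and returns it once the stream ends.
func finalContent<S: AsyncSequence>(of stream: S) async throws -> String
where S.Element == String {
    var latest = ""
    for try await snapshot in stream {
        // Replace rather than append, under the cumulative-snapshot assumption.
        latest = snapshot
        print(latest)
    }
    return latest
}

let finalText = try await finalContent(of: simulatedSnapshots())
```

If snapshots turn out to carry only new fragments, the assignment inside the loop would need to append instead of replace.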
     
    

    What else can you do?

    You can use various additional configuration options and capabilities for your hybrid experiences:

    Features not-yet-supported for hybrid or on-device inference

    Because this is an experimental release, not all capabilities of Firebase AI Logic or cloud-hosted models are supported.

    • The following are not supported for hybrid or on-device implementations: Imagen models, the Gemini Live API, and prompt templates. Also, token counting shouldn't be relied upon, because counts differ between cloud-hosted and on-device models, so there's no intuitive fallback.

    • The following features are not yet supported for on-device inference. If you want to use any of these features, then we recommend using only a cloud-hosted model for a more consistent experience.

      • Generating text from multimodal inputs, like images, audio, video, and documents (PDFs)

      • Generating media, like images, audio, or video

      • Sending requests that exceed 4096 tokens (or approximately 3000 English words).

      • Providing the on-device model with built-in tools to help it generate its response (like code execution, URL context, and Grounding with Google Search)

    • AI monitoring in the Firebase console does not show any data for on-device inference (including on-device logs). However, any inference that uses a cloud-hosted model can be monitored just like other inference via Firebase AI Logic.
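Since token counting isn't reliable across models, one option for respecting the 4096-token limit listed above is a rough word-based pre-check. The sketch below derives its ratio from the figures in that bullet (4096 tokens for approximately 3000 English words). It is a heuristic and not a tokenizer, and the function names and the 0.9 safety margin are hypothetical choices.

```swift
// Figures quoted in the limitations list above.
let onDeviceTokenLimit = 4096
let approxWordsAtLimit = 3000

// Rough estimate based on the ratio above (about 1.37 tokens per English word).
// This is a heuristic, not a tokenizer; real counts differ per model and language.
func estimatedTokenCount(for text: String) -> Int {
    let words = text.split(whereSeparator: { $0.isWhitespace }).count
    let estimate = Double(words) * Double(onDeviceTokenLimit) / Double(approxWordsAtLimit)
    return Int(estimate.rounded(.up))
}

// Returns true when the estimate stays under the limit with some headroom.
func likelyFitsOnDevice(_ text: String, safetyMargin: Double = 0.9) -> Bool {
    Double(estimatedTokenCount(for: text)) <= Double(onDeviceTokenLimit) * safetyMargin
}

let shortPrompt = "Write a story about a magic backpack."
print(likelyFitsOnDevice(shortPrompt))
```

When the check fails, the request could be sent through a session configured with only a cloud-hosted model, in line with the recommendation above.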

    Additional limitations

    In addition to the above, on-device inference has the following limitations:


    Give feedback about your experience with Firebase AI Logic

