Configuration options for hybrid experiences in Apple apps


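This page describes the following configuration options for hybrid and on-device experiences:

  • Set an "inference mode"
  • Check if the on-device model is available
  • Determine whether on-device or in-cloud inference was used
  • Use model configuration to control responses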

Make sure that you've completed the getting started guide for building hybrid experiences.

Set an "inference mode"

The examples in the getting started guide attempt on-device inference first and then fall back to the cloud-hosted model. This is only one of the available "inference modes" that you can implement.

Hybrid inference

  • Prefer on-device inference: set primary to a "system" model and secondary to a cloud model.

    Attempt to use the on-device model if it's available and supports the type of request. Otherwise, log an error on the device and then automatically fall back to the cloud-hosted model.

      // Imports + initialization of Gemini API backend service
      // ...

      // Initialize a cloud model that supports your use case
      let cloudModel = ai.geminiModel(name: "GEMINI_MODEL_NAME")

      // Initialize an on-device model that supports your use case
      let systemModel = FirebaseAI.SystemLanguageModel.default

      // Create a GenerativeModelSession with a hybrid model.
      // Provide your preferred model as `primary` and your fallback model as `secondary`.
      // Attempt to use the on-device model; otherwise, fall back to the cloud-hosted model.
      let session = ai.generativeModelSession(
        model: .hybridModel(primary: systemModel, secondary: cloudModel)
      )
     
    
  • Prefer in-cloud inference: set primary to a cloud model and secondary to a "system" model.

    Attempt to use the cloud-hosted model if the device is online and if the model is available. If the device is offline, fall back to the on-device model. In all other failure cases, throw an exception.

      // Imports + initialization of Gemini API backend service
      // ...

      // Initialize a cloud model that supports your use case
      let cloudModel = ai.geminiModel(name: "GEMINI_MODEL_NAME")

      // Initialize an on-device model that supports your use case
      let systemModel = FirebaseAI.SystemLanguageModel.default

      // Create a GenerativeModelSession with a hybrid model.
      // Provide your preferred model as `primary` and your fallback model as `secondary`.
      // Attempt to use the cloud-hosted model; otherwise, fall back to the on-device model.
      let session = ai.generativeModelSession(
        model: .hybridModel(primary: cloudModel, secondary: systemModel)
      )
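
The routing rules for the two hybrid modes can be summarized as a small decision function. This is only a sketch of the behavior described above, not SDK code: `Backend`, `HybridPreference`, `DeviceState`, `InferenceUnavailable`, and `resolveBackend` are hypothetical names introduced for illustration, and the fallback attempt itself can still fail in a real app.

```swift
// Hypothetical types for illustration; none of these are Firebase AI Logic SDK APIs.
enum Backend: Equatable {
    case onDevice
    case cloud
}

enum HybridPreference {
    case preferOnDevice  // primary: "system" model, secondary: cloud model
    case preferCloud     // primary: cloud model, secondary: "system" model
}

struct DeviceState {
    var onDeviceUsable: Bool       // on-device model is available and supports the request type
    var isOnline: Bool             // device has network connectivity
    var cloudModelAvailable: Bool  // cloud-hosted model is available
}

struct InferenceUnavailable: Error {}

/// Returns the backend a request is routed to under the rules described above.
func resolveBackend(preference: HybridPreference, state: DeviceState) throws -> Backend {
    switch preference {
    case .preferOnDevice:
        // Fall back to the cloud whenever the on-device model can't take the request.
        return state.onDeviceUsable ? .onDevice : .cloud
    case .preferCloud:
        if !state.isOnline {
            // Being offline is the only case that triggers the on-device fallback.
            return .onDevice
        }
        guard state.cloudModelAvailable else {
            // Online, but the cloud model can't serve the request: no fallback.
            throw InferenceUnavailable()
        }
        return .cloud
    }
}
```

Note the asymmetry: preferring on-device falls back for any on-device failure, while preferring the cloud falls back only when the device is offline.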
     
    

Only on-device or only in-cloud inference

You can also set only a single model, in which case the SDK attempts only on-device inference or only in-cloud inference and never falls back. You don't create a HybridModel for this use case. For a hybrid experience, you do need to create a HybridModel and set both primary and secondary models (as described above).

  • Only on-device inference: set model to a "system" model. You don't create a HybridModel for this use case.

    Attempt to use the on-device model if it's available and supports the type of request. Otherwise, throw an exception.

      // Imports + initialization of Gemini API backend service
      // ...

      // Initialize an on-device model that supports your use case
      let systemModel = FirebaseAI.SystemLanguageModel.default

      // Create a GenerativeModelSession with the on-device model.
      let session = ai.generativeModelSession(model: systemModel)
     
    
  • Only in-cloud inference: set model to a cloud model. You don't create a HybridModel for this use case.

    Attempt to use the cloud-hosted model if the device is online and if the model is available. Otherwise, throw an exception.

      // Imports + initialization of Gemini API backend service
      // ...

      // Initialize a cloud model that supports your use case
      let cloudModel = ai.geminiModel(name: "GEMINI_MODEL_NAME")

      // Create a GenerativeModelSession with a cloud model.
      let session = ai.generativeModelSession(model: cloudModel)
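
Because a single-model session throws instead of falling back, your app decides what the user sees when a request fails. The sketch below illustrates that pattern with stand-in types only: `Responder`, `UnavailableModel`, `EchoModel`, `ModelUnavailable`, and `answer` are hypothetical and not part of the SDK. It is also synchronous to stay self-contained, whereas a real session call is `try await session.respond(to:)`.

```swift
// Hypothetical stand-ins for illustration; not Firebase AI Logic SDK types.
struct ModelUnavailable: Error {}

protocol Responder {
    func respond(to prompt: String) throws -> String
}

// Simulates a single-model session whose model can't serve the request.
struct UnavailableModel: Responder {
    func respond(to prompt: String) throws -> String {
        throw ModelUnavailable()
    }
}

// Simulates a single-model session that succeeds.
struct EchoModel: Responder {
    func respond(to prompt: String) throws -> String {
        "Echo: \(prompt)"
    }
}

/// With only one model configured there is no automatic fallback, so the
/// caller chooses what to show when the request throws.
func answer<R: Responder>(_ prompt: String, using responder: R) -> String {
    do {
        return try responder.respond(to: prompt)
    } catch {
        return "A response isn't available right now."
    }
}
```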
     
    

Check if the on-device model is available

Manual checks for on-device availability are only necessary if you want to surface that information to the user or ask users to take action to download the on-device model. If the on-device model is unavailable, and you've set primary to an on-device model and secondary to a cloud model, then the SDK automatically falls back to the cloud-hosted model.

To manually check whether the on-device model is actually usable, inspect the isAvailable property:

  if FirebaseAI.SystemLanguageModel.default.isAvailable {
    // The on-device model is ready to use.
  } else {
    // The on-device model is unavailable.
  }
 

To check for specific on-device model availability reasons, inspect the availability property:

  switch FirebaseAI.SystemLanguageModel.default.availability {
  case .available:
    // The on-device model is ready to use.
    break
  case .unavailable(.deviceNotEligible):
    // This device does not support Apple Intelligence.
    break
  case .unavailable(.appleIntelligenceNotEnabled):
    // The user has not enabled Apple Intelligence in Settings.
    break
  case .unavailable(.modelNotReady):
    // The model is still being downloaded.
    break
  case let .unavailable(reason):
    // The model is unavailable due to the specified `reason`.
    break
  }
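
If you surface availability to the user, one option is to map each reason to a short message. The following is a minimal sketch that mirrors the cases in the switch above with local enums so that it runs on its own: `OnDeviceAvailability`, `UnavailableReason`, and `userMessage` are hypothetical names, and the message strings are placeholders.

```swift
// Local enums that mirror the cases shown above; hypothetical, not SDK types.
enum UnavailableReason {
    case deviceNotEligible
    case appleIntelligenceNotEnabled
    case modelNotReady
}

enum OnDeviceAvailability {
    case available
    case unavailable(UnavailableReason)
}

/// Returns a message to show the user, or nil when no message is needed.
func userMessage(for availability: OnDeviceAvailability) -> String? {
    switch availability {
    case .available:
        return nil
    case .unavailable(.deviceNotEligible):
        return "This device doesn't support Apple Intelligence."
    case .unavailable(.appleIntelligenceNotEnabled):
        return "Turn on Apple Intelligence in Settings to use on-device features."
    case .unavailable(.modelNotReady):
        return "The on-device model is still downloading. Try again later."
    }
}
```

In a real app, the switch over the SDK's availability value would likely keep a catch-all case like the one shown above, in case more reasons exist than the three listed here.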
 

Determine whether on-device or in-cloud inference was used

If you use a HybridModel (and set both primary and secondary models), then it might be helpful to know which model was used for a given request. This information is provided by the modelVersion property of rawResponse in each response.

When you access this property, the returned value will be one of the following:

  • Cloud-hosted model used: the model name, for example gemini-3.1-flash-lite
  • On-device model used: apple-foundation-models-system-language-model

  // let response = try await session.respond(to: ...
  print("You used: \(response.rawResponse.modelVersion)")
  print(response.content)
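
For logging or analytics, you might turn the model version string into a typed value. This sketch assumes the on-device identifier is exactly the string listed above and treats any other value as a cloud model name. `InferenceSource` and `inferenceSource(forModelVersion:)` are hypothetical helpers, not SDK APIs, and they take a plain `String`, so unwrap the property first if it turns out to be optional.

```swift
// Hypothetical helper types for illustration; not SDK APIs.
enum InferenceSource: Equatable {
    case onDevice
    case cloud(modelName: String)
}

// The identifier reported when the on-device model served the request.
let onDeviceModelVersion = "apple-foundation-models-system-language-model"

/// Classifies a model version string as on-device or cloud-hosted.
func inferenceSource(forModelVersion modelVersion: String) -> InferenceSource {
    modelVersion == onDeviceModelVersion
        ? .onDevice
        : .cloud(modelName: modelVersion)
}
```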
 

Use model configuration to control responses

In each request to a model, you can send along a model configuration to control how the model generates a response. Cloud-hosted models and on-device models offer different configuration options (cloud vs. on-device parameters), so for hybrid inference you provide a separate configuration for each.

Here's an example that sets the configurations for the cloud-hosted and on-device models for hybrid inference:

  // ...
  let response = try await session.respond(
    to: "Why is the sky blue?",
    options: .hybrid(
      // Config for cloud-hosted model
      gemini: GenerationConfig(
        temperature: 0.8,
        topP: 0.9,
        thinkingConfig: ThinkingConfig(thinkingLevel: .high)
      ),
      // Config for on-device model
      foundationModels: FirebaseAI.GenerationOptions(
        sampling: .random(probabilityThreshold: 0.9),
        temperature: 0.8
      )
    )
  )
  // ...
 

