Configuration options for hybrid experiences in Apple apps

This page describes the following configuration options for hybrid and on-device experiences:

Set an "inference mode".
Check if the on-device model is available.
Determine whether on-device or in-cloud inference was used.
Use model configuration to control responses (like temperature).

Make sure that you've completed the getting started guide for building hybrid experiences.

Set an "inference mode"

The examples in the getting started guide show how to attempt on-device inference first and then fall back to the cloud-hosted model. This is only one of the available "inference modes" that you can implement.

Hybrid inference

Prefer on-device inference: set primary to a "system" model and secondary to a cloud model.

Attempt to use the on-device model if it's available and supports the type of request. Otherwise, log an error on the device and then automatically fall back to the cloud-hosted model.

    // Imports + initialization of Gemini API backend service
    // ...

    // Initialize a cloud model that supports your use case
    let cloudModel = ai.geminiModel(name: "GEMINI_MODEL_NAME")

    // Initialize an on-device model that supports your use case
    let systemModel = FirebaseAI.SystemLanguageModel.default

    // Create a GenerativeModelSession with a hybrid model.
    // Provide your preferred model as `primary` and your fallback model as `secondary`
    // Attempt to use the on-device model; otherwise, fall back to the cloud-hosted model.
    let session = ai.generativeModelSession(
      model: .hybridModel(primary: systemModel, secondary: cloudModel)
    )

Prefer in-cloud inference: set primary to a cloud model and secondary to a "system" model.

Attempt to use the cloud-hosted model if the device is online and if the model is available. If the device is offline, fall back to the on-device model. In all other failure cases, throw an exception.

    // Imports + initialization of Gemini API backend service
    // ...

    // Initialize a cloud model that supports your use case
    let cloudModel = ai.geminiModel(name: "GEMINI_MODEL_NAME")

    // Initialize an on-device model that supports your use case
    let systemModel = FirebaseAI.SystemLanguageModel.default

    // Create a GenerativeModelSession with a hybrid model.
    // Provide your preferred model as `primary` and your fallback model as `secondary`
    // Attempt to use the cloud-hosted model; otherwise, fall back to the on-device model.
    let session = ai.generativeModelSession(
      model: .hybridModel(primary: cloudModel, secondary: systemModel)
    )

Only on-device or only in-cloud inference

The SDK also supports setting a single model, in which case it attempts only on-device inference or only in-cloud inference. You don't create a HybridModel for this use case. For a hybrid experience, however, you do need to create a HybridModel and set both primary and secondary models (as described above).

Only on-device inference: set model to a "system" model. You don't create a HybridModel for this use case.

Attempt to use the on-device model if it's available and supports the type of request. Otherwise, throw an exception.

    // Imports + initialization of Gemini API backend service
    // ...

    // Initialize an on-device model that supports your use case
    let systemModel = FirebaseAI.SystemLanguageModel.default

    // Create a GenerativeModelSession with the on-device model.
    let session = ai.generativeModelSession(model: systemModel)

Only in-cloud inference: set model to a cloud model. You don't create a HybridModel for this use case.

Attempt to use the cloud-hosted model if the device is online and if the model is available. Otherwise, throw an exception.

    // Imports + initialization of Gemini API backend service
    // ...

    // Initialize a cloud model that supports your use case
    let cloudModel = ai.geminiModel(name: "GEMINI_MODEL_NAME")

    // Create a GenerativeModelSession with a cloud model.
    let session = ai.generativeModelSession(model: cloudModel)
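
The four inference modes above can be restated as a small decision table. The following is a minimal sketch in plain Swift, not SDK code: `InferenceMode`, `Outcome`, `Conditions`, and `expectedOutcome` are hypothetical names introduced only for illustration, and the two branches marked as assumptions cover cases that this page does not spell out.

```swift
// Hypothetical model of the documented inference modes.
// None of these types come from the Firebase SDK.
enum InferenceMode {
    case preferOnDevice  // primary: system model, secondary: cloud model
    case preferInCloud   // primary: cloud model, secondary: system model
    case onlyOnDevice    // single system model, no HybridModel
    case onlyInCloud     // single cloud model, no HybridModel
}

enum Outcome {
    case onDevice
    case cloud
    case throwsError
}

struct Conditions {
    var onDeviceUsable: Bool       // on-device model is available and supports the request
    var isOnline: Bool
    var cloudModelAvailable: Bool
}

func expectedOutcome(mode: InferenceMode, conditions c: Conditions) -> Outcome {
    let cloudUsable = c.isOnline && c.cloudModelAvailable
    switch mode {
    case .preferOnDevice:
        if c.onDeviceUsable { return .onDevice }
        // Assumption: if the cloud fallback is also unusable, the request fails.
        return cloudUsable ? .cloud : .throwsError
    case .preferInCloud:
        if cloudUsable { return .cloud }
        // Only the offline case falls back to the on-device model.
        // Assumption: the fallback itself needs a usable on-device model.
        if !c.isOnline && c.onDeviceUsable { return .onDevice }
        return .throwsError
    case .onlyOnDevice:
        return c.onDeviceUsable ? .onDevice : .throwsError
    case .onlyInCloud:
        return cloudUsable ? .cloud : .throwsError
    }
}

// Example: an offline device with a usable on-device model.
let offlineDevice = Conditions(onDeviceUsable: true, isOnline: false, cloudModelAvailable: true)
print(expectedOutcome(mode: .preferInCloud, conditions: offlineDevice))  // prints "onDevice"
```

A table like this can be handy when deciding which mode fits an app: for example, it makes visible that "prefer in-cloud" falls back only when the device is offline, while "prefer on-device" falls back whenever the on-device model can't serve the request.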

Check if the on-device model is available

Manual checks for on-device availability are only necessary if you want to surface that information to the user or ask end users to take action to download the on-device model. If the on-device model is unavailable (and you've set primary to an on-device model and secondary to a cloud model), then the SDK automatically falls back to the cloud-hosted model.

To manually check whether the on-device model is actually usable, inspect the isAvailable property:

    if FirebaseAI.SystemLanguageModel.default.isAvailable {
      // The on-device model is ready to use.
    } else {
      // The on-device model is unavailable.
    }

To check for specific on-device model availability reasons, inspect the availability property:

    switch FirebaseAI.SystemLanguageModel.default.availability {
    case .available:
      // The on-device model is ready to use.
      break
    case .unavailable(.deviceNotEligible):
      // This device does not support Apple Intelligence.
      break
    case .unavailable(.appleIntelligenceNotEnabled):
      // The user has not enabled Apple Intelligence in Settings.
      break
    case .unavailable(.modelNotReady):
      // The model is still being downloaded.
      break
    case let .unavailable(reason):
      // The model is unavailable due to the specified `reason`.
      break
    }
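
If you do want to surface availability to the user, one option is to map each reason to a short message. The sketch below is self-contained and uses stand-in enums (`StandInAvailability` and `StandInReason`, both hypothetical) that only mirror the cases shown above; in an app you would switch over the real `availability` value instead and keep the catch-all `case let .unavailable(reason)` branch. The message strings are placeholders.

```swift
// Stand-in types that mirror the cases shown above so this sketch compiles on its own.
// They are NOT the SDK types.
enum StandInReason {
    case deviceNotEligible
    case appleIntelligenceNotEnabled
    case modelNotReady
}

enum StandInAvailability {
    case available
    case unavailable(StandInReason)
}

// Returns a user-facing hint, or nil when the on-device model is ready
// and nothing needs to be shown.
func userMessage(for availability: StandInAvailability) -> String? {
    switch availability {
    case .available:
        return nil
    case .unavailable(.deviceNotEligible):
        return "This device doesn't support Apple Intelligence."
    case .unavailable(.appleIntelligenceNotEnabled):
        return "Turn on Apple Intelligence in Settings to use on-device features."
    case .unavailable(.modelNotReady):
        return "The on-device model is still downloading. Try again later."
    }
}

// Example: show a hint only when there is something to tell the user.
if let message = userMessage(for: .unavailable(.modelNotReady)) {
    print(message)
}
```

Returning an optional keeps the call site simple: when the model is available, there is no message to display.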

Determine whether on-device or in-cloud inference was used

If you use a HybridModel (and set both primary and secondary models), then it might be helpful to know which model was used for a given request. This information is provided by the modelVersion property of rawResponse in each response.

When you access this property, the returned value will be one of the following:

Cloud-hosted model used: the model name, for example gemini-3.1-flash-lite
On-device model used: apple-foundation-models-system-language-model

    // let response = try await session.respond(to: ...
    print("You used: \(response.rawResponse.modelVersion)")
    print(response.content)
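
For logging or analytics, it may be convenient to turn the `modelVersion` string into a typed value. The following is a minimal sketch in plain Swift: `InferenceLocation` and `inferenceLocation(forModelVersion:)` are hypothetical helpers, not part of the SDK, and the parameter is typed as an optional string on the assumption that the value could be missing. The only value taken from this page is the on-device identifier.

```swift
// Hypothetical helper types; not part of the Firebase SDK.
enum InferenceLocation: Equatable {
    case onDevice
    case cloud(modelName: String)
    case unknown
}

// The value this page documents for on-device inference.
let onDeviceModelVersion = "apple-foundation-models-system-language-model"

func inferenceLocation(forModelVersion modelVersion: String?) -> InferenceLocation {
    // Assumption: a missing or empty value means the source can't be determined.
    guard let version = modelVersion, !version.isEmpty else { return .unknown }
    if version == onDeviceModelVersion { return .onDevice }
    // Any other value is treated as the name of a cloud-hosted model.
    return .cloud(modelName: version)
}

// Example with the cloud model name used on this page.
switch inferenceLocation(forModelVersion: "gemini-3.1-flash-lite") {
case .onDevice:
    print("Answered on device")
case .cloud(let modelName):
    print("Answered in the cloud by \(modelName)")
case .unknown:
    print("Model source unknown")
}
```

In an app, you would pass `response.rawResponse.modelVersion` to the helper after each request.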

Use model configuration to control responses

In each request to a model, you can send along a model configuration to control how the model generates a response. Cloud-hosted models and on-device models offer different configuration options (cloud vs. on-device parameters).

Cloud-hosted models: set their configuration in a GenerationConfig.
On-device models: set their configuration within FirebaseAI.GenerationOptions.

These options are configured for each request to the model.

Here's an example that sets the configurations for the cloud-hosted and on-device models for hybrid inference:

    // ...

    let response = try await session.respond(
      to: "Why is the sky blue?",
      options: .hybrid(
        // Config for cloud-hosted model
        gemini: GenerationConfig(
          temperature: 0.8,
          topP: 0.9,
          thinkingConfig: ThinkingConfig(thinkingLevel: .high)
        ),
        // Config for on-device model
        foundationModels: FirebaseAI.GenerationOptions(
          sampling: .random(probabilityThreshold: 0.9),
          temperature: 0.8
        )
      )
    )

    // ...
