Capabilities of the Live API


This page describes the capabilities of the Live API when you use it via Firebase AI Logic, including the supported input modalities (audio, text + audio, and video + audio) and advanced capabilities like incremental content updates and mid-session updates to system instructions.

You can also customize your implementation by using various configuration options, like adding transcription or setting the response voice.



Input modalities

This section describes how to send various types of input to a Live API model. Native audio models always require audio input (optionally accompanied by text or video input), and they always respond with audio output.

Stream audio input


The most common capability of the Live API is bidirectional audio streaming, meaning real-time streaming of both audio input and output.

The Live API supports the following audio formats:

  • Input audio format: raw 16-bit PCM audio at 16 kHz, little-endian
  • Output audio format: raw 16-bit PCM audio at 24 kHz, little-endian

  • Supported MIME types: audio/x-aac, audio/flac, audio/mp3, audio/m4a, audio/mpeg, audio/mpga, audio/mp4, audio/ogg, audio/pcm, audio/wav, audio/webm

To convey the sample rate of input audio, set the MIME type of each audio-containing Blob to a value like audio/pcm;rate=16000 .
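For example, if you build the media chunks yourself (as in the Dart snippet later in this section), you can append the rate directly to the MIME type string. This is a minimal sketch; audioRecordStream stands in for your own recorder's stream of raw PCM data.

// Minimal sketch (Dart): tag each raw PCM chunk with its sample rate.
// `audioRecordStream` is assumed to be a Stream<Uint8List> of 16-bit PCM at 16 kHz.
final mediaChunkStream = audioRecordStream.map((data) {
  return InlineDataPart('audio/pcm;rate=16000', data);
});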

Swift

To use the Live API, create a LiveModel instance and set the response modality to audio.

import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `liveModel` instance with a model that supports the Live API
let liveModel = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to respond with audio
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio]
  )
)

do {
  let session = try await liveModel.connect()

  // Load the audio file, or tap a microphone
  guard let audioFile = NSDataAsset(name: "audio.pcm") else {
    fatalError("Failed to load audio file")
  }

  // Provide the audio data
  await session.sendAudioRealtime(audioFile.data)

  var outputText = ""
  for try await message in session.responses {
    if case let .content(content) = message.payload {
      content.modelTurn?.parts.forEach { part in
        if let part = part as? InlineDataPart, part.mimeType.starts(with: "audio/pcm") {
          // Handle 16-bit PCM audio data at 24 kHz
          playAudio(part.data)
        }
      }
      // Optional: close the session if you don't need to send more requests.
      if content.isTurnComplete {
        await session.close()
      }
    }
  }
} catch {
  fatalError(error.localizedDescription)
}

Kotlin

To use the Live API, create a LiveModel instance and set the response modality to AUDIO.

// Initialize the Gemini Developer API backend service
// Create a `liveModel` instance with a model that supports the Live API
val liveModel = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to respond with audio
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
    }
)

val session = liveModel.connect()

// This is the recommended approach.
// However, you can create your own recorder and handle the stream.
session.startAudioConversation()

Java

To use the Live API, create a LiveModel instance and set the response modality to AUDIO.

ExecutorService executor = Executors.newFixedThreadPool(1);

// Initialize the Gemini Developer API backend service
// Create a `liveModel` instance with a model that supports the Live API
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
    "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to respond with audio
    new LiveGenerationConfig.Builder()
        .setResponseModality(ResponseModality.AUDIO)
        .build()
);
LiveModelFutures liveModel = LiveModelFutures.from(lm);

ListenableFuture<LiveSession> sessionFuture = liveModel.connect();

Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
        LiveSessionFutures session = LiveSessionFutures.from(ses);
        session.startAudioConversation();
    }

    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web

To use the Live API, create a LiveGenerativeModel instance and set the response modality to AUDIO.

import { initializeApp } from "firebase/app";
import { getAI, getLiveGenerativeModel, startAudioConversation, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `LiveGenerativeModel` instance with a model that supports the Live API
const liveModel = getLiveGenerativeModel(ai, {
  model: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to respond with audio
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
  },
});

const session = await liveModel.connect();

// Start the audio conversation
const audioConversationController = await startAudioConversation(session);

// ... Later, to stop the audio conversation
// await audioConversationController.stop()

Dart

To use the Live API, create a LiveGenerativeModel instance and set the response modality to audio.

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
import 'package:your_audio_recorder_package/your_audio_recorder_package.dart';

late LiveModelSession _session;
final _audioRecorder = YourAudioRecorder();

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `liveGenerativeModel` instance with a model that supports the Live API
final liveModel = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.5-flash-native-audio-preview-12-2025',
  // Configure the model to respond with audio
  liveGenerationConfig: LiveGenerationConfig(
    responseModalities: [ResponseModalities.audio],
  ),
);

_session = await liveModel.connect();

final audioRecordStream = _audioRecorder.startRecordingStream();

// Map the Uint8List stream to InlineDataPart stream
final mediaChunkStream = audioRecordStream.map((data) {
  return InlineDataPart('audio/pcm', data);
});

await _session.startMediaStream(mediaChunkStream);

// In a separate thread, receive the audio response from the model
await for (final message in _session.receive()) {
  // Process the received message
}

Unity

To use the Live API, create a LiveModel instance and set the response modality to Audio.

using Firebase;
using Firebase.AI;

async Task SendTextReceiveAudio() {
  // Initialize the Gemini Developer API backend service
  // Create a `LiveModel` instance with a model that supports the Live API
  var liveModel = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to respond with audio
    liveGenerationConfig: new LiveGenerationConfig(
      responseModalities: new[] { ResponseModality.Audio })
  );

  LiveSession session = await liveModel.ConnectAsync();

  // Start a coroutine to send audio from the Microphone
  var recordingCoroutine = StartCoroutine(SendAudio(session));

  // Start receiving the response
  await ReceiveAudio(session);
}

IEnumerator SendAudio(LiveSession liveSession) {
  string microphoneDeviceName = null;
  int recordingFrequency = 16000;
  int recordingBufferSeconds = 2;

  var recordingClip = Microphone.Start(microphoneDeviceName, true,
      recordingBufferSeconds, recordingFrequency);

  int lastSamplePosition = 0;
  while (true) {
    if (!Microphone.IsRecording(microphoneDeviceName)) {
      yield break;
    }

    int currentSamplePosition = Microphone.GetPosition(microphoneDeviceName);
    if (currentSamplePosition != lastSamplePosition) {
      // The Microphone uses a circular buffer, so we need to check if the
      // current position wrapped around to the beginning, and handle it
      // accordingly.
      int sampleCount;
      if (currentSamplePosition > lastSamplePosition) {
        sampleCount = currentSamplePosition - lastSamplePosition;
      } else {
        sampleCount = recordingClip.samples - lastSamplePosition + currentSamplePosition;
      }

      if (sampleCount > 0) {
        // Get the audio chunk
        float[] samples = new float[sampleCount];
        recordingClip.GetData(samples, lastSamplePosition);

        // Send the data, discarding the resulting Task to avoid the warning
        _ = liveSession.SendAudioAsync(samples);

        lastSamplePosition = currentSamplePosition;
      }
    }

    // Wait for a short delay before reading the next sample from the Microphone
    const float MicrophoneReadDelay = 0.5f;
    yield return new WaitForSeconds(MicrophoneReadDelay);
  }
}

Queue<float> audioBuffer = new();

async Task ReceiveAudio(LiveSession liveSession) {
  int sampleRate = 24000;
  int channelCount = 1;

  // Create a looping AudioClip to fill with the received audio data
  int bufferSamples = (int)(sampleRate * channelCount);
  AudioClip clip = AudioClip.Create("StreamingPCM", bufferSamples, channelCount,
      sampleRate, true, OnAudioRead);

  // Attach the clip to an AudioSource and start playing it
  AudioSource audioSource = GetComponent<AudioSource>();
  audioSource.clip = clip;
  audioSource.loop = true;
  audioSource.Play();

  // Start receiving the response
  await foreach (var message in liveSession.ReceiveAsync()) {
    // Process the received message
    foreach (float[] pcmData in message.AudioAsFloat) {
      lock (audioBuffer) {
        foreach (float sample in pcmData) {
          audioBuffer.Enqueue(sample);
        }
      }
    }
  }
}

// This method is called by the AudioClip to load audio data.
private void OnAudioRead(float[] data) {
  int samplesToProvide = data.Length;
  int samplesProvided = 0;

  lock (audioBuffer) {
    while (samplesProvided < samplesToProvide && audioBuffer.Count > 0) {
      data[samplesProvided] = audioBuffer.Dequeue();
      samplesProvided++;
    }
  }

  while (samplesProvided < samplesToProvide) {
    data[samplesProvided] = 0.0f;
    samplesProvided++;
  }
}

Stream text + audio input


If needed, you can send text input along with the audio input and receive streamed audio output.

Swift

To use the Live API, create a LiveModel instance and set the response modality to audio.

import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `liveModel` instance with a model that supports the Live API
let liveModel = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to respond with audio
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio]
  )
)

do {
  let session = try await liveModel.connect()

  // Provide a text prompt
  let text = "tell a short story"
  await session.sendTextRealtime(text)

  var outputText = ""
  for try await message in session.responses {
    if case let .content(content) = message.payload {
      content.modelTurn?.parts.forEach { part in
        if let part = part as? InlineDataPart, part.mimeType.starts(with: "audio/pcm") {
          // Handle 16-bit PCM audio data at 24 kHz
          playAudio(part.data)
        }
      }
      // Optional: close the session if you don't need to send more requests.
      if content.isTurnComplete {
        await session.close()
      }
    }
  }
} catch {
  fatalError(error.localizedDescription)
}

Kotlin

To use the Live API, create a LiveModel instance and set the response modality to AUDIO.

// Initialize the Gemini Developer API backend service
// Create a `liveModel` instance with a model that supports the Live API
val liveModel = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to respond with audio
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
    }
)

val session = liveModel.connect()

// Provide a text prompt
val text = "tell a short story"
session.send(text)

session.receive().collect {
    if (it.turnComplete) {
        // Optional: stop receiving if you don't need to send more requests.
        session.stopReceiving()
    }
    // Handle 16-bit PCM audio data at 24 kHz
    playAudio(it.data)
}

Java

To use the Live API, create a LiveModel instance and set the response modality to AUDIO.

ExecutorService executor = Executors.newFixedThreadPool(1);

// Initialize the Gemini Developer API backend service
// Create a `liveModel` instance with a model that supports the Live API
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
    "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to respond with audio
    new LiveGenerationConfig.Builder()
        .setResponseModality(ResponseModality.AUDIO)
        .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);

ListenableFuture<LiveSession> sessionFuture = model.connect();

class LiveContentResponseSubscriber implements Subscriber<LiveContentResponse> {
    @Override
    public void onSubscribe(Subscription s) {
        s.request(Long.MAX_VALUE); // Request an unlimited number of items
    }

    @Override
    public void onNext(LiveContentResponse liveContentResponse) {
        // Handle 16-bit PCM audio data at 24 kHz
        liveContentResponse.getData();
    }

    @Override
    public void onError(Throwable t) {
        System.err.println("Error: " + t.getMessage());
    }

    @Override
    public void onComplete() {
        System.out.println("Done receiving messages!");
    }
}

Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
        LiveSessionFutures session = LiveSessionFutures.from(ses);

        // Provide a text prompt
        String text = "tell me a short story?";
        session.send(text);

        Publisher<LiveContentResponse> publisher = session.receive();
        publisher.subscribe(new LiveContentResponseSubscriber());
    }

    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web

To use the Live API, create a LiveGenerativeModel instance and set the response modality to AUDIO.

import { initializeApp } from "firebase/app";
import { getAI, getLiveGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `LiveGenerativeModel` instance with a model that supports the Live API
const liveModel = getLiveGenerativeModel(ai, {
  model: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to respond with audio
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
  },
});

const session = await liveModel.connect();

// Provide a text prompt
const prompt = "tell a short story";
session.send(prompt);

// Handle the model's audio output
const messages = session.receive();
for await (const message of messages) {
  switch (message.type) {
    case "serverContent":
      if (message.turnComplete) {
        // TODO(developer): Handle turn completion
      } else if (message.interrupted) {
        // TODO(developer): Handle the interruption
        break;
      } else if (message.modelTurn) {
        const parts = message.modelTurn?.parts;
        parts?.forEach((part) => {
          if (part.inlineData) {
            // TODO(developer): Play the audio chunk
          }
        });
      }
      break;
    case "toolCall":
      // Ignore
    case "toolCallCancellation":
      // Ignore
  }
}

Dart

To use the Live API, create a LiveGenerativeModel instance and set the response modality to audio.

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
import 'dart:async';
import 'dart:typed_data';

late LiveModelSession _session;

Future<Stream<Uint8List>> textToAudio(String textPrompt) async {
  WidgetsFlutterBinding.ensureInitialized();
  await Firebase.initializeApp(
    options: DefaultFirebaseOptions.currentPlatform,
  );

  // Initialize the Gemini Developer API backend service
  // Create a `liveGenerativeModel` instance with a model that supports the Live API
  final liveModel = FirebaseAI.googleAI().liveGenerativeModel(
    model: 'gemini-2.5-flash-native-audio-preview-12-2025',
    // Configure the model to respond with audio
    liveGenerationConfig: LiveGenerationConfig(
      responseModalities: [ResponseModalities.audio],
    ),
  );

  _session = await liveModel.connect();

  final prompt = Content.text(textPrompt);
  await _session.send(input: prompt);

  return _session.receive().asyncMap((response) async {
    if (response is LiveServerContent && response.modelTurn?.parts != null) {
      for (final part in response.modelTurn!.parts) {
        if (part is InlineDataPart) {
          return part.bytes;
        }
      }
    }
    throw Exception('Audio data not found');
  });
}

Future<void> main() async {
  try {
    final audioStream = await textToAudio('Convert this text to audio.');
    await for (final audioData in audioStream) {
      // Process the audio data (e.g., play it using an audio player package)
      print('Received audio data: ${audioData.length} bytes');
      // Example using flutter_sound (replace with your chosen package):
      // await _flutterSoundPlayer.startPlayer(fromDataBuffer: audioData);
    }
  } catch (e) {
    print('Error: $e');
  }
}

Unity

To use the Live API, create a LiveModel instance and set the response modality to Audio.

using Firebase;
using Firebase.AI;

async Task SendTextReceiveAudio() {
  // Initialize the Gemini Developer API backend service
  // Create a `LiveModel` instance with a model that supports the Live API
  var liveModel = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to respond with audio
    liveGenerationConfig: new LiveGenerationConfig(
      responseModalities: new[] { ResponseModality.Audio })
  );

  LiveSession session = await liveModel.ConnectAsync();

  // Provide a text prompt
  var prompt = ModelContent.Text("Convert this text to audio.");
  await session.SendAsync(content: prompt, turnComplete: true);

  // Start receiving the response
  await ReceiveAudio(session);
}

Queue<float> audioBuffer = new();

async Task ReceiveAudio(LiveSession session) {
  int sampleRate = 24000;
  int channelCount = 1;

  // Create a looping AudioClip to fill with the received audio data
  int bufferSamples = (int)(sampleRate * channelCount);
  AudioClip clip = AudioClip.Create("StreamingPCM", bufferSamples, channelCount,
      sampleRate, true, OnAudioRead);

  // Attach the clip to an AudioSource and start playing it
  AudioSource audioSource = GetComponent<AudioSource>();
  audioSource.clip = clip;
  audioSource.loop = true;
  audioSource.Play();

  // Start receiving the response
  await foreach (var message in session.ReceiveAsync()) {
    // Process the received message
    foreach (float[] pcmData in message.AudioAsFloat) {
      lock (audioBuffer) {
        foreach (float sample in pcmData) {
          audioBuffer.Enqueue(sample);
        }
      }
    }
  }
}

// This method is called by the AudioClip to load audio data.
private void OnAudioRead(float[] data) {
  int samplesToProvide = data.Length;
  int samplesProvided = 0;

  lock (audioBuffer) {
    while (samplesProvided < samplesToProvide && audioBuffer.Count > 0) {
      data[samplesProvided] = audioBuffer.Dequeue();
      samplesProvided++;
    }
  }

  while (samplesProvided < samplesToProvide) {
    data[samplesProvided] = 0.0f;
    samplesProvided++;
  }
}

Note that you can also send text as incremental content updates during an active session.

Stream video + audio input

Input video content gives the model visual context for the input audio.

The Live API expects video as a sequence of discrete image frames and supports frame input at 1 frame per second (FPS).

  • Recommended input: native 768x768 resolution at 1 FPS.

  • Supported MIME types: video/x-flv, video/quicktime, video/mpeg, video/mpegs, video/mpg, video/mp4, video/webm, video/wmv, video/3gpp

Streaming video + audio input is a more advanced implementation, so check out a sample app to learn how to implement this capability: Swift - coming soon! | Android - sample app | Web - coming soon! | Flutter - sample app | Unity - coming soon!
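As a rough illustration of the frame-based approach, here is a minimal Dart sketch. It assumes a hypothetical frameStream of JPEG-encoded snapshots captured at about 1 FPS and reuses the startMediaStream pattern from the audio example above; both the 'image/jpeg' MIME type and the streaming call are assumptions here, so refer to the sample apps for a complete, verified implementation.

// Minimal sketch (Dart), not a verified implementation.
// `frameStream` is a hypothetical Stream<Uint8List> of JPEG-encoded frames
// captured at roughly 1 FPS (for example, 768x768 snapshots from the camera).
final videoChunkStream = frameStream.map((jpegBytes) {
  // The MIME type is an assumption; check the sample apps for the exact format.
  return InlineDataPart('image/jpeg', jpegBytes);
});

// Stream the frames over the active session, alongside any audio input,
// using the same mechanism shown in the audio streaming example.
await _session.startMediaStream(videoChunkStream);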



Advanced capabilities

The Live API models support the following advanced capabilities for mid-session updates:

Add incremental content updates

You can add incremental updates during an active session. Use this to send text input, establish session context, or restore session context.

  • For longer contexts, we recommend providing a single-message summary to free up the context window for subsequent interactions (see the sketch after this list).

  • For short contexts, you can send turn-by-turn interactions to represent the exact sequence of events, like the snippets below.
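For the summary approach, here is a minimal sketch in Dart. It reuses the send pattern from the Dart snippet below; the summary text is only an illustration, and _session is assumed to be an active LiveModelSession.

// Minimal sketch (Dart): restore context with a single summary message,
// keeping the conversational turn open so the model doesn't respond yet.
final summary = Content.text(
  'Summary of the conversation so far: the user asked about European '
  'capitals and the model answered "Paris" for France.',
);
await _session.send(input: summary, turnComplete: false);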

Swift

// Define initial turns (history/context).
let turns: [ModelContent] = [
  ModelContent(role: "user", parts: [TextPart("What is the capital of France?")]),
  ModelContent(role: "model", parts: [TextPart("Paris")]),
]

// Send history, keeping the conversational turn OPEN (false).
await session.sendContent(turns, turnComplete: false)

// Define the new user query.
let newTurn: [ModelContent] = [
  ModelContent(role: "user", parts: [TextPart("What is the capital of Germany?")]),
]

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.sendContent(newTurn, turnComplete: true)

Kotlin

Not yet supported for Android apps - check back soon!

Java

Not yet supported for Android apps - check back soon!

Web

// Define initial turns (history/context).
const turns = [{ text: "Hello from the user!" }];

// Send history, keeping the conversational turn OPEN (false).
await session.send(
  turns,
  false // turnComplete: false
);
console.log("Sent history. Waiting for next input...");

// Define the new user query.
const newTurn = [{ text: "And what is the capital of Germany?" }];

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.send(
  newTurn,
  true // turnComplete: true
);
console.log("Sent final query. Model response expected now.");

Dart

// Define initial turns (history/context).
final List<Content> turns = [
  Content(
    "user",
    [Part.text("What is the capital of France?")],
  ),
  Content(
    "model",
    [Part.text("Paris")],
  ),
];

// Send history, keeping the conversational turn OPEN (false).
await session.send(
  input: turns,
  turnComplete: false,
);

// Define the new user query.
final List<Content> newTurn = [
  Content(
    "user",
    [Part.text("What is the capital of Germany?")],
  ),
];

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.send(
  input: newTurn,
  turnComplete: true,
);

Unity

// Define initial turns (history/context).
List<ModelContent> turns = new List<ModelContent> {
  new ModelContent("user",
    new ModelContent.TextPart("What is the capital of France?")
  ),
  new ModelContent("model",
    new ModelContent.TextPart("Paris")
  ),
};

// Send history, keeping the conversational turn OPEN (false).
foreach (ModelContent turn in turns) {
  await session.SendAsync(
    content: turn,
    turnComplete: false
  );
}

// Define the new user query.
ModelContent newTurn = ModelContent.Text("What is the capital of Germany?");

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.SendAsync(
  content: newTurn,
  turnComplete: true
);

Update system instructions mid-session

Only available when using the Vertex AI Gemini API as your API provider.

You can update the system instructions during an active session. Use this to adapt the model's responses, for example to change the response language or modify the tone.

To update the system instructions mid-session, you can send text content with the system role. The updated system instructions will remain in effect for the remainder of the session.

Swift

await session.sendContent(
  [ModelContent(
    role: "system",
    parts: [TextPart("new system instruction")]
  )],
  turnComplete: false
)

Kotlin

Not yet supported for Android apps - check back soon!

Java

Not yet supported for Android apps - check back soon!

Web

Not yet supported for Web apps - check back soon!

Dart

try {
  await _session.send(
    input: Content(
      'system',
      [Part.text('new system instruction')],
    ),
    turnComplete: false,
  );
} catch (e) {
  print('Failed to update system instructions: $e');
}

Unity

try {
  await session.SendAsync(
    content: new ModelContent(
      "system",
      new ModelContent.TextPart("new system instruction")
    ),
    turnComplete: false
  );
} catch (Exception e) {
  Debug.LogError($"Failed to update system instructions: {e.Message}");
}



Unsupported features

  • Features not yet supported by Firebase AI Logic when using the Live API, but coming soon:

    • Handling interruptions

    • Session management, including resuming a session across multiple connections, extending the session length, or compressing the context window.

    • Disabling and configuring voice activity detection (VAD)

    • Setting input media resolution

    • Adding a thinking configuration

    • Enabling affective dialogue or proactive audio

    • Receiving UsageMetadata in the response

  • Features not supported by Firebase AI Logic when using the Live API and not currently planned:

    • Server prompt templates

    • Hybrid or on-device inference

    • AI monitoring in the Firebase console



What else can you do?

  • Customize your implementation by using various configuration options, like adding transcription or setting the response voice.

  • Supercharge your implementation by giving the model access to tools, like function calling and grounding with Google Search. Official documentation for using tools with the Live API is coming soon!

  • Learn about limits and specifications for using the Live API, like session length, rate limits, and supported languages.
