Capabilities of the Live API


This page describes the capabilities of the Live API when you use it with Firebase AI Logic, including the input modalities that you can stream to the model.

Visit other pages to learn about customizing your implementation by using various configuration options, like adding transcription or setting the response voice. You can also learn about managing sessions.



Input modalities

This section describes how to send various types of input to a Live API model. Native audio models always require audio input (optionally accompanied by text or video input), and they always respond with audio output.

Stream audio input

The most common capability of the Live API is bidirectional audio streaming, meaning real-time streaming of both audio input and output.

The Live API supports the following audio formats:

  • Input audio format: Raw 16-bit PCM audio at 16 kHz, little-endian
  • Output audio format: Raw 16-bit PCM audio at 24 kHz, little-endian

  • Supported MIME types: audio/x-aac, audio/flac, audio/mp3, audio/m4a, audio/mpeg, audio/mpga, audio/mp4, audio/ogg, audio/pcm, audio/wav, audio/webm

To convey the sample rate of input audio, set the MIME type of each audio-containing Blob to a value like audio/pcm;rate=16000.
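
For example, here is a minimal Dart sketch of that pattern. It assumes a hypothetical `recorderStream` (a `Stream<Uint8List>` of raw 16-bit PCM chunks captured at 16 kHz) and reuses the `InlineDataPart` type shown in the Dart examples later on this page:

// A minimal sketch, not an official sample: `recorderStream` is a hypothetical
// Stream<Uint8List> of raw 16-bit PCM audio captured at 16 kHz.
final mediaChunkStream = recorderStream.map((data) {
  // Include the sample rate in the MIME type so the Live API knows how to
  // interpret the raw PCM bytes.
  return InlineDataPart('audio/pcm;rate=16000', data);
});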

Swift

To use the Live API, create a LiveModel instance and set the response modality to audio.

import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `liveModel` instance with a model that supports the Live API
let liveModel = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to respond with audio
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio]
  )
)

do {
  let session = try await liveModel.connect()

  // Load the audio file, or tap a microphone
  guard let audioFile = NSDataAsset(name: "audio.pcm") else {
    fatalError("Failed to load audio file")
  }

  // Provide the audio data
  await session.sendAudioRealtime(audioFile.data)

  for try await message in session.responses {
    if case let .content(content) = message.payload {
      content.modelTurn?.parts.forEach { part in
        if let part = part as? InlineDataPart, part.mimeType.starts(with: "audio/pcm") {
          // Handle 16-bit PCM audio data at 24 kHz
          playAudio(part.data)
        }
      }
      // Optional: close the session if you don't need to send more requests
      if content.isTurnComplete {
        await session.close()
      }
    }
  }
} catch {
  fatalError(error.localizedDescription)
}

Kotlin

To use the Live API, create a LiveModel instance and set the response modality to AUDIO.

// Initialize the Gemini Developer API backend service
// Create a `liveModel` instance with a model that supports the Live API
val liveModel = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
  modelName = "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to respond with audio
  generationConfig = liveGenerationConfig {
    responseModality = ResponseModality.AUDIO
  }
)

val session = liveModel.connect()

// This is the recommended approach.
// However, you can create your own recorder and handle the stream.
session.startAudioConversation()

Java

To use the Live API, create a LiveModel instance and set the response modality to AUDIO.

ExecutorService executor = Executors.newFixedThreadPool(1);

// Initialize the Gemini Developer API backend service
// Create a `liveModel` instance with a model that supports the Live API
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
    "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to respond with audio
    new LiveGenerationConfig.Builder()
        .setResponseModality(ResponseModality.AUDIO)
        .build()
);
LiveModelFutures liveModel = LiveModelFutures.from(lm);

ListenableFuture<LiveSession> sessionFuture = liveModel.connect();

Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
  @Override
  public void onSuccess(LiveSession ses) {
    LiveSessionFutures session = LiveSessionFutures.from(ses);
    session.startAudioConversation();
  }

  @Override
  public void onFailure(Throwable t) {
    // Handle exceptions
  }
}, executor);

Web

To use the Live API, create a LiveGenerativeModel instance and set the response modality to AUDIO.

import { initializeApp } from "firebase/app";
import { getAI, getLiveGenerativeModel, startAudioConversation, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `LiveGenerativeModel` instance with a model that supports the Live API
const liveModel = getLiveGenerativeModel(ai, {
  model: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to respond with audio
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
  },
});

const session = await liveModel.connect();

// Start the audio conversation
const audioConversationController = await startAudioConversation(session);

// ... Later, to stop the audio conversation
// await audioConversationController.stop()

Dart

To use the Live API, create a LiveGenerativeModel instance and set the response modality to audio.

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
import 'package:your_audio_recorder_package/your_audio_recorder_package.dart';

late LiveModelSession _session;
final _audioRecorder = YourAudioRecorder();

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `liveGenerativeModel` instance with a model that supports the Live API
final liveModel = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.5-flash-native-audio-preview-12-2025',
  // Configure the model to respond with audio
  liveGenerationConfig: LiveGenerationConfig(
    responseModalities: [ResponseModalities.audio],
  ),
);

_session = await liveModel.connect();

final audioRecordStream = _audioRecorder.startRecordingStream();

// Map the Uint8List stream to InlineDataPart stream
final mediaChunkStream = audioRecordStream.map((data) {
  return InlineDataPart('audio/pcm', data);
});

await _session.startMediaStream(mediaChunkStream);

// In a separate thread, receive the audio response from the model
await for (final message in _session.receive()) {
  // Process the received message
}

Unity

To use the Live API, create a LiveModel instance and set the response modality to Audio.

using Firebase;
using Firebase.AI;

async Task SendTextReceiveAudio() {
  // Initialize the Gemini Developer API backend service
  // Create a `LiveModel` instance with a model that supports the Live API
  var liveModel = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to respond with audio
    liveGenerationConfig: new LiveGenerationConfig(
      responseModalities: new[] { ResponseModality.Audio })
  );

  LiveSession session = await liveModel.ConnectAsync();

  // Start a coroutine to send audio from the Microphone
  var recordingCoroutine = StartCoroutine(SendAudio(session));

  // Start receiving the response
  await ReceiveAudio(session);
}

IEnumerator SendAudio(LiveSession liveSession) {
  string microphoneDeviceName = null;
  int recordingFrequency = 16000;
  int recordingBufferSeconds = 2;

  var recordingClip = Microphone.Start(microphoneDeviceName, true,
      recordingBufferSeconds, recordingFrequency);

  int lastSamplePosition = 0;
  while (true) {
    if (!Microphone.IsRecording(microphoneDeviceName)) {
      yield break;
    }

    int currentSamplePosition = Microphone.GetPosition(microphoneDeviceName);
    if (currentSamplePosition != lastSamplePosition) {
      // The Microphone uses a circular buffer, so we need to check if the
      // current position wrapped around to the beginning, and handle it
      // accordingly.
      int sampleCount;
      if (currentSamplePosition > lastSamplePosition) {
        sampleCount = currentSamplePosition - lastSamplePosition;
      } else {
        sampleCount = recordingClip.samples - lastSamplePosition + currentSamplePosition;
      }

      if (sampleCount > 0) {
        // Get the audio chunk
        float[] samples = new float[sampleCount];
        recordingClip.GetData(samples, lastSamplePosition);

        // Send the data, discarding the resulting Task to avoid the warning
        _ = liveSession.SendAudioAsync(samples);

        lastSamplePosition = currentSamplePosition;
      }
    }

    // Wait for a short delay before reading the next sample from the Microphone
    const float MicrophoneReadDelay = 0.5f;
    yield return new WaitForSeconds(MicrophoneReadDelay);
  }
}

Queue<float> audioBuffer = new();

async Task ReceiveAudio(LiveSession liveSession) {
  int sampleRate = 24000;
  int channelCount = 1;

  // Create a looping AudioClip to fill with the received audio data
  int bufferSamples = (int)(sampleRate * channelCount);
  AudioClip clip = AudioClip.Create("StreamingPCM", bufferSamples, channelCount,
      sampleRate, true, OnAudioRead);

  // Attach the clip to an AudioSource and start playing it
  AudioSource audioSource = GetComponent<AudioSource>();
  audioSource.clip = clip;
  audioSource.loop = true;
  audioSource.Play();

  // Start receiving the response
  await foreach (var message in liveSession.ReceiveAsync()) {
    // Process the received message
    foreach (float[] pcmData in message.AudioAsFloat) {
      lock (audioBuffer) {
        foreach (float sample in pcmData) {
          audioBuffer.Enqueue(sample);
        }
      }
    }
  }
}

// This method is called by the AudioClip to load audio data.
private void OnAudioRead(float[] data) {
  int samplesToProvide = data.Length;
  int samplesProvided = 0;

  lock (audioBuffer) {
    while (samplesProvided < samplesToProvide && audioBuffer.Count > 0) {
      data[samplesProvided] = audioBuffer.Dequeue();
      samplesProvided++;
    }
  }

  while (samplesProvided < samplesToProvide) {
    data[samplesProvided] = 0.0f;
    samplesProvided++;
  }
}

Stream text + audio input

If needed, you can send text input along with the audio input and receive streamed audio output.

Swift

To use the Live API, create a LiveModel instance and set the response modality to audio.

import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `liveModel` instance with a model that supports the Live API
let liveModel = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to respond with audio
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio]
  )
)

do {
  let session = try await liveModel.connect()

  // Provide a text prompt
  let text = "tell a short story"
  await session.sendTextRealtime(text)

  for try await message in session.responses {
    if case let .content(content) = message.payload {
      content.modelTurn?.parts.forEach { part in
        if let part = part as? InlineDataPart, part.mimeType.starts(with: "audio/pcm") {
          // Handle 16-bit PCM audio data at 24 kHz
          playAudio(part.data)
        }
      }
      // Optional: close the session if you don't need to send more requests
      if content.isTurnComplete {
        await session.close()
      }
    }
  }
} catch {
  fatalError(error.localizedDescription)
}

Kotlin

To use the Live API, create a LiveModel instance and set the response modality to AUDIO.

// Initialize the Gemini Developer API backend service
// Create a `liveModel` instance with a model that supports the Live API
val liveModel = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
  modelName = "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to respond with audio
  generationConfig = liveGenerationConfig {
    responseModality = ResponseModality.AUDIO
  }
)

val session = liveModel.connect()

// Provide a text prompt
val text = "tell a short story"
session.send(text)

session.receive().collect {
  if (it.turnComplete) {
    // Optional: stop receiving if you don't need to send more requests
    session.stopReceiving()
  }

  // Handle 16-bit PCM audio data at 24 kHz
  playAudio(it.data)
}

Java

To use the Live API, create a LiveModel instance and set the response modality to AUDIO.

ExecutorService executor = Executors.newFixedThreadPool(1);

// Initialize the Gemini Developer API backend service
// Create a `liveModel` instance with a model that supports the Live API
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
    "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to respond with audio
    new LiveGenerationConfig.Builder()
        .setResponseModality(ResponseModality.AUDIO)
        .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);

ListenableFuture<LiveSession> sessionFuture = model.connect();

class LiveContentResponseSubscriber implements Subscriber<LiveContentResponse> {
  @Override
  public void onSubscribe(Subscription s) {
    s.request(Long.MAX_VALUE); // Request an unlimited number of items
  }

  @Override
  public void onNext(LiveContentResponse liveContentResponse) {
    // Handle 16-bit PCM audio data at 24 kHz
    liveContentResponse.getData();
  }

  @Override
  public void onError(Throwable t) {
    System.err.println("Error: " + t.getMessage());
  }

  @Override
  public void onComplete() {
    System.out.println("Done receiving messages!");
  }
}

Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
  @Override
  public void onSuccess(LiveSession ses) {
    LiveSessionFutures session = LiveSessionFutures.from(ses);

    // Provide a text prompt
    String text = "tell me a short story?";
    session.send(text);

    Publisher<LiveContentResponse> publisher = session.receive();
    publisher.subscribe(new LiveContentResponseSubscriber());
  }

  @Override
  public void onFailure(Throwable t) {
    // Handle exceptions
  }
}, executor);

Web

To use the Live API, create a LiveGenerativeModel instance and set the response modality to AUDIO.

import { initializeApp } from "firebase/app";
import { getAI, getLiveGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `LiveGenerativeModel` instance with a model that supports the Live API
const liveModel = getLiveGenerativeModel(ai, {
  model: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Configure the model to respond with audio
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
  },
});

const session = await liveModel.connect();

// Provide a text prompt
const prompt = "tell a short story";
session.send(prompt);

// Handle the model's audio output
const messages = session.receive();
for await (const message of messages) {
  switch (message.type) {
    case "serverContent":
      if (message.turnComplete) {
        // TODO(developer): Handle turn completion
      } else if (message.interrupted) {
        // TODO(developer): Handle the interruption
        break;
      } else if (message.modelTurn) {
        const parts = message.modelTurn?.parts;
        parts?.forEach((part) => {
          if (part.inlineData) {
            // TODO(developer): Play the audio chunk
          }
        });
      }
      break;
    case "toolCall":
      // Ignore
    case "toolCallCancellation":
      // Ignore
  }
}

Dart

To use the Live API, create a LiveGenerativeModel instance and set the response modality to audio.

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'package:flutter/widgets.dart';
import 'firebase_options.dart';
import 'dart:async';
import 'dart:typed_data';

late LiveModelSession _session;

Future<Stream<Uint8List>> textToAudio(String textPrompt) async {
  WidgetsFlutterBinding.ensureInitialized();
  await Firebase.initializeApp(
    options: DefaultFirebaseOptions.currentPlatform,
  );

  // Initialize the Gemini Developer API backend service
  // Create a `liveGenerativeModel` instance with a model that supports the Live API
  final liveModel = FirebaseAI.googleAI().liveGenerativeModel(
    model: 'gemini-2.5-flash-native-audio-preview-12-2025',
    // Configure the model to respond with audio
    liveGenerationConfig: LiveGenerationConfig(
      responseModalities: [ResponseModalities.audio],
    ),
  );

  _session = await liveModel.connect();

  // Provide a text prompt
  final prompt = Content.text(textPrompt);
  await _session.send(input: prompt);

  return _session.receive().asyncMap((response) async {
    if (response is LiveServerContent && response.modelTurn?.parts != null) {
      for (final part in response.modelTurn!.parts) {
        if (part is InlineDataPart) {
          return part.bytes;
        }
      }
    }
    throw Exception('Audio data not found');
  });
}

Future<void> main() async {
  try {
    final audioStream = await textToAudio('Convert this text to audio.');
    await for (final audioData in audioStream) {
      // Process the audio data (e.g., play it using an audio player package)
      print('Received audio data: ${audioData.length} bytes');
      // Example using flutter_sound (replace with your chosen package):
      // await _flutterSoundPlayer.startPlayer(fromDataBuffer: audioData);
    }
  } catch (e) {
    print('Error: $e');
  }
}

Unity

To use the Live API, create a LiveModel instance and set the response modality to Audio.

using Firebase;
using Firebase.AI;

async Task SendTextReceiveAudio() {
  // Initialize the Gemini Developer API backend service
  // Create a `LiveModel` instance with a model that supports the Live API
  var liveModel = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
    // Configure the model to respond with audio
    liveGenerationConfig: new LiveGenerationConfig(
      responseModalities: new[] { ResponseModality.Audio })
  );

  LiveSession session = await liveModel.ConnectAsync();

  // Provide a text prompt
  var prompt = ModelContent.Text("Convert this text to audio.");
  await session.SendAsync(content: prompt, turnComplete: true);

  // Start receiving the response
  await ReceiveAudio(session);
}

Queue<float> audioBuffer = new();

async Task ReceiveAudio(LiveSession session) {
  int sampleRate = 24000;
  int channelCount = 1;

  // Create a looping AudioClip to fill with the received audio data
  int bufferSamples = (int)(sampleRate * channelCount);
  AudioClip clip = AudioClip.Create("StreamingPCM", bufferSamples, channelCount,
      sampleRate, true, OnAudioRead);

  // Attach the clip to an AudioSource and start playing it
  AudioSource audioSource = GetComponent<AudioSource>();
  audioSource.clip = clip;
  audioSource.loop = true;
  audioSource.Play();

  // Start receiving the response
  await foreach (var message in session.ReceiveAsync()) {
    // Process the received message
    foreach (float[] pcmData in message.AudioAsFloat) {
      lock (audioBuffer) {
        foreach (float sample in pcmData) {
          audioBuffer.Enqueue(sample);
        }
      }
    }
  }
}

// This method is called by the AudioClip to load audio data.
private void OnAudioRead(float[] data) {
  int samplesToProvide = data.Length;
  int samplesProvided = 0;

  lock (audioBuffer) {
    while (samplesProvided < samplesToProvide && audioBuffer.Count > 0) {
      data[samplesProvided] = audioBuffer.Dequeue();
      samplesProvided++;
    }
  }

  while (samplesProvided < samplesToProvide) {
    data[samplesProvided] = 0.0f;
    samplesProvided++;
  }
}

Note that you can also send text as incremental content updates during an active session.
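
For example, here is a minimal Dart sketch of such an update, reusing the open `_session` and the `Content.text` pattern from the Dart example above (the prompt string is only illustrative):

// A minimal sketch: `_session` is an already-connected LiveModelSession,
// as in the Dart example above. Sending more text on the open session adds
// it to the conversation context of the active session.
await _session.send(input: Content.text('Also mention the weather.'));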

Stream video + audio input

Input video content provides visual context for the input audio.

The Live API expects a sequence of discrete image frames and supports video frame input at 1 frame per second (FPS).

  • Recommended input: native 768x768 resolution at 1 FPS.

  • Supported MIME types: video/x-flv, video/quicktime, video/mpeg, video/mpegs, video/mpg, video/mp4, video/webm, video/wmv, video/3gpp

Streaming video + audio input is a more advanced implementation, so check out a sample app to learn how to implement this capability: Swift - coming soon! | Android - sample app | Web - coming soon! | Flutter - sample app | Unity - coming soon!
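
As a rough, unofficial Dart sketch of the 1 FPS frame pattern, the snippet below throttles frames and reuses `_session`, `mediaChunkStream`, and `startMediaStream` from the first Dart example on this page. `captureVideoFrame()` is a hypothetical helper that returns one encoded frame as `Uint8List`, and merging the audio and video streams into a single `startMediaStream` call is an assumption, not documented behavior:

import 'package:async/async.dart'; // for StreamGroup.merge

// A minimal, unofficial sketch: captureVideoFrame() is a hypothetical helper.
// Frames are throttled to the recommended 1 frame per second.
final videoFrameStream = Stream.periodic(const Duration(seconds: 1)).asyncMap((_) async {
  final frameBytes = await captureVideoFrame();
  // Use one of the supported video MIME types listed above.
  return InlineDataPart('video/mp4', frameBytes);
});

// Combine the video frames with the audio chunk stream and stream both to
// the open session (assumes a single merged stream is acceptable).
await _session.startMediaStream(
  StreamGroup.merge([mediaChunkStream, videoFrameStream]),
);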



Unsupported features

  • Features not yet supported by Firebase AI Logic when using the Live API (these are coming soon!):

    • Handling interruptions

    • Some aspects of session management, including resuming a session across multiple connections, extending the session length, or compressing the context window. Note, though, that go-away notifications are supported.

    • Disabling and configuring voice activity detection (VAD)

    • Setting input media resolution

    • Adding a thinking configuration

    • Enabling affective dialogue or proactive audio

    • Receiving UsageMetadata in the response

  • Features not supported by Firebase AI Logic when using the Live API and not currently planned:

    • Server prompt templates

    • Hybrid or on-device inference

    • AI monitoring in the Firebase console



What else can you do?

  • Customize your implementation by using various configuration options, like adding transcription or setting the response voice.

  • Learn about managing sessions, including updating content mid-session and detecting when a session is about to end.

  • Supercharge your implementation by giving the model access to tools, like function calling and grounding with Google Search. Official documentation for using tools with the Live API is coming soon!

  • Learn about the limits and specifications for using the Live API, like session length, rate limits, and supported languages.
