The Gemini Live API processes continuous streams of audio or text called sessions. You can manage the session lifecycle, from the initial handshake to graceful termination.
Limits for sessions
For the Live API, a session is a persistent connection over which input and output are streamed continuously.
If the session exceeds any of the following limits, the connection is terminated. The Live API does provide some options (see below) for handling these session-related limits.
- Session context window is limited to 128k tokens.
  Due to this context window limit, here are the approximate maximum session lengths based on input modalities:
  - Audio-only input sessions are limited to 15 minutes.
  - Video + audio input sessions are limited to 2 minutes.
- Connection length is limited to about 10 minutes. You'll receive a going away notification about 60 seconds before the connection ends.
Here are some options for handling session-related limits:
- Compress the session context window so that the server automatically maintains the context size within the limit.
- Resume a session to prevent losing conversation context during brief network disconnects or after receiving a going away notification.
Start a session
Visit the getting started guide for the Live API for a full snippet showing how to start a session.
Update mid-session
The Live API models support the following advanced capabilities for mid-session updates:
- Add incremental content updates
- Update system instructions (for Vertex AI Gemini API only)
Add incremental content updates
You can add incremental updates during an active session. Use this to send text input, establish session context, or restore session context.
- For longer contexts, we recommend providing a single message summary to free up the context window for subsequent interactions.
- For short contexts, you can send turn-by-turn interactions to represent the exact sequence of events, like the snippet below.
Swift
// Define initial turns (history/context).
let turns: [ModelContent] = [
  ModelContent(role: "user", parts: [TextPart("What is the capital of France?")]),
  ModelContent(role: "model", parts: [TextPart("Paris")]),
]

// Send history, keeping the conversational turn OPEN (false).
await session.sendContent(turns, turnComplete: false)

// Define the new user query.
let newTurn: [ModelContent] = [
  ModelContent(role: "user", parts: [TextPart("What is the capital of Germany?")]),
]

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.sendContent(newTurn, turnComplete: true)
Kotlin
Not yet supported for Android apps - check back soon!
Java
Not yet supported for Android apps - check back soon!
Web
const turns = [{ text: "Hello from the user!" }];

await session.send(
  turns,
  false // turnComplete: false
);

console.log("Sent history. Waiting for next input...");

// Define the new user query.
const newTurn = [{ text: "And what is the capital of Germany?" }];

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.send(
  newTurn,
  true // turnComplete: true
);

console.log("Sent final query. Model response expected now.");
Dart
// Define initial turns (history/context).
final List<Content> turns = [
  Content("user", [Part.text("What is the capital of France?")]),
  Content("model", [Part.text("Paris")]),
];

// Send history, keeping the conversational turn OPEN (false).
await session.send(
  input: turns,
  turnComplete: false,
);

// Define the new user query.
final List<Content> newTurn = [
  Content("user", [Part.text("What is the capital of Germany?")]),
];

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.send(
  input: newTurn,
  turnComplete: true,
);
Unity
// Define initial turns (history/context).
List<ModelContent> turns = new List<ModelContent> {
    new ModelContent("user", new ModelContent.TextPart("What is the capital of France?")),
    new ModelContent("model", new ModelContent.TextPart("Paris")),
};

// Send history, keeping the conversational turn OPEN (false).
foreach (ModelContent turn in turns)
{
    await session.SendAsync(content: turn, turnComplete: false);
}

// Define the new user query.
ModelContent newTurn = ModelContent.Text("What is the capital of Germany?");

// Send the final query, CLOSING the turn (true) to trigger the model response.
await session.SendAsync(content: newTurn, turnComplete: true);
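The choice described earlier in this section, between a single summary message for longer contexts and turn-by-turn replay for short ones, can be made in a small helper before the content is sent. The sketch below is illustrative only: `buildContextTurns`, the `maxTurns` threshold of 6, and the `{ role, parts }` object shape are assumptions made for this example rather than part of any SDK, and the summary text has to come from your own app.

```javascript
// Hypothetical helper (not an SDK function): decides how to replay earlier
// context into a session. Short histories are replayed turn by turn; longer
// ones are collapsed into one caller-supplied summary message.
function buildContextTurns(history, summary, maxTurns = 6) {
  if (history.length <= maxTurns) {
    // Short context: keep the exact sequence of events.
    return history.map(({ role, text }) => ({ role, parts: [{ text }] }));
  }
  if (!summary) {
    throw new Error("A summary is required when history exceeds maxTurns");
  }
  // Long context: a single message leaves more of the context window free.
  return [{ role: "user", parts: [{ text: `Conversation so far: ${summary}` }] }];
}

const shortHistory = [
  { role: "user", text: "What is the capital of France?" },
  { role: "model", text: "Paris" },
];
console.log(buildContextTurns(shortHistory, null).length); // 2
```

The returned array would then be passed to the platform's send call with the turn left open, followed by the new user query that closes the turn.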
Update system instructions mid-session
You can update the system instructions during an active session. Use this to adapt the model's responses, for example to change the response language or modify the tone.
To update the system instructions mid-session, you can send text content with the system role. The updated system instructions will remain in effect for the remainder of the session.
Swift
await session.sendContent(
  [ModelContent(role: "system", parts: [TextPart("new system instruction")])],
  turnComplete: false
)
Kotlin
Not yet supported for Android apps - check back soon!
Java
Not yet supported for Android apps - check back soon!
Web
Not yet supported for Web apps - check back soon!
Dart
try {
  await _session.send(
    input: Content('system', [Part.text('new system instruction')]),
    turnComplete: false,
  );
} catch (e) {
  print('Failed to update system instructions: $e');
}
Unity
try
{
    await session.SendAsync(
        content: new ModelContent("system", new ModelContent.TextPart("new system instruction")),
        turnComplete: false
    );
}
catch (Exception e)
{
    Debug.LogError($"Failed to update system instructions: {e.Message}");
}
Compress the context window
The Live API session context window stores real-time streamed data (25 tokens per second (TPS) for audio and 258 TPS for video) as well as other content, including text inputs and model outputs. All Live API models have a session context window limit of 128k tokens.
By default, due to this context window limit, here are the approximate maximum session lengths based on input modalities:
- Audio-only input sessions are limited to 15 minutes.
- Video + audio input sessions are limited to 2 minutes.
In long-running sessions, as the conversation progresses, the history of audio and/or video tokens accumulates. If this history exceeds the model's limit, the model may hallucinate, slow down, or the session may be forcibly terminated.
To enable longer sessions, you can enable context window compression by setting the contextWindowCompression field as part of the LiveGenerationConfig. When enabled, the server uses a sliding-window mechanism to automatically discard the oldest turns or summarize them to maintain the context size within the default or specified limits. System instructions are not discarded and will always stay at the beginning of the context window.
From the user's perspective, this allows for theoretically infinite session durations since the "memory" is constantly managed.
You can configure the sliding-window mechanism and, optionally, the number of tokens that triggers compression (see the available settings and values below). Here are some high-level considerations about using these settings:
- Setting targetTokens very low will free up more context room for continuous streams, but the model will rapidly "forget" older turns of the conversation.
- Setting targetTokens closer to triggerTokens preserves more memory but will trigger compression routines far more frequently.
Available settings:
- triggerTokens: the context length before compression is triggered.
- targetTokens: the target number of tokens to keep. If triggerTokens is not explicitly set, then targetTokens defaults to 50% of the default triggerTokens value. The targetTokens value must be less than the triggerTokens value.
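The constraints on these two settings can be checked on the client before the configuration is sent. The following is a minimal sketch, not SDK code: `resolveCompressionConfig` is a hypothetical helper, `defaultTriggerTokens` is a value the caller must supply (this sketch does not assume what the server default is), and deriving an omitted targetTokens as 50% of the effective trigger is an assumption based on the default rule described above.

```javascript
// Hypothetical helper (not an SDK function): validates compression settings
// before they are placed in the generation config.
// `defaultTriggerTokens` is caller-supplied; no server default is assumed here.
function resolveCompressionConfig({ triggerTokens, targetTokens } = {}, defaultTriggerTokens) {
  const trigger = triggerTokens ?? defaultTriggerTokens;
  if (!Number.isInteger(trigger) || trigger <= 0) {
    throw new RangeError("triggerTokens must be a positive integer");
  }
  // Assumption: when targetTokens is omitted, use 50% of the effective trigger.
  const target = targetTokens ?? Math.floor(trigger / 2);
  // The target must be strictly below the trigger.
  if (!Number.isInteger(target) || target <= 0 || target >= trigger) {
    throw new RangeError("targetTokens must be a positive integer below triggerTokens");
  }
  return { triggerTokens: trigger, slidingWindow: { targetTokens: target } };
}

const config = resolveCompressionConfig({ triggerTokens: 10000, targetTokens: 2000 }, 100000);
console.log(config.slidingWindow.targetTokens); // 2000
```

Failing early on an invalid pair (for example, a target equal to the trigger) surfaces the mistake before a connection attempt is made.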
Swift
// Initialize the Gemini Developer API backend service
let liveModel = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Enable context window compression.
  // (Optional) Configure the number of tokens in the context window that triggers the compression.
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio],
    contextWindowCompression: ContextWindowCompressionConfig(
      triggerTokens: 10000,
      slidingWindow: SlidingWindow(targetTokens: 2000)
    )
  )
)
Kotlin
// Initialize the Gemini Developer API backend service
val liveModel = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
  modelName = "gemini-2.5-flash-native-audio-preview-12-2025",
  // Enable context window compression.
  // (Optional) Configure the number of tokens in the context window that triggers the compression.
  generationConfig = liveGenerationConfig {
    responseModality = ResponseModality.AUDIO
    contextWindowCompression = ContextWindowCompressionConfig(
      triggerTokens = 10000,
      slidingWindow = SlidingWindow(targetTokens = 2000)
    )
  }
)
Java
// Initialize the Gemini Developer API backend service
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
    "gemini-2.5-flash-native-audio-preview-12-2025",
    // Enable context window compression.
    // (Optional) Configure the number of tokens in the context window that triggers the compression.
    new LiveGenerationConfig.Builder()
        .setResponseModality(ResponseModality.AUDIO)
        .setContextWindowCompression(
            new ContextWindowCompressionConfig(10000, new SlidingWindow(2000))
        )
        .build()
);
Web
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

const liveModel = getLiveGenerativeModel(ai, {
  model: "gemini-2.5-flash-native-audio-preview-12-2025",
  // Enable context window compression.
  // (Optional) Configure the number of tokens in the context window that triggers the compression.
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
    contextWindowCompression: {
      triggerTokens: 10000,
      slidingWindow: {
        targetTokens: 2000,
      },
    },
  },
});
Dart
final _liveModel = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.5-flash-native-audio-preview-12-2025',
  // Enable context window compression.
  // (Optional) Configure the number of tokens in the context window that triggers the compression.
  liveGenerationConfig: LiveGenerationConfig(
    responseModalities: [ResponseModalities.audio],
    contextWindowCompression: ContextWindowCompressionConfig(
      triggerTokens: 10000,
      slidingWindow: SlidingWindow(targetTokens: 2000),
    ),
  ),
);
Unity
var liveModel = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.5-flash-native-audio-preview-12-2025",
    // Enable context window compression.
    // (Optional) Configure the number of tokens in the context window that triggers the compression.
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio },
        contextWindowCompression: new ContextWindowCompressionConfig(
            triggerTokens: 10000,
            slidingWindow: new SlidingWindow(targetTokens: 2000)
        )
    )
);
Detect when a session is going to end
The maximum duration of a single, continuous WebSocket connection is about 10 minutes. You'll receive a going away notification about 60 seconds before the connection ends.
The following example shows how to detect an impending connection termination by listening for a going away notification:
Swift
for try await response in session.responses {
  switch response.payload {
  case .goingAwayNotice(let goingAwayNotice):
    // Prepare for the session to close soon
    if let timeLeft = goingAwayNotice.timeLeft {
      print("Server going away in \(timeLeft) seconds")
    }
  // ... handle other payload types ...
  default:
    break
  }
}
Kotlin
for (response in session.responses) {
  when (val message = response.payload) {
    is LiveServerGoAway -> {
      // Prepare for the session to close soon
      val remaining = message.timeLeft
      logger.info("Server going away in $remaining")
    }
  }
}
Java
session.getResponses().forEach(response -> {
    if (response.getPayload() instanceof LiveServerResponse.GoingAwayNotice) {
        LiveServerResponse.GoingAwayNotice notice =
            (LiveServerResponse.GoingAwayNotice) response.getPayload();
        // Prepare for the session to close soon
        Duration timeLeft = notice.getTimeLeft();
    }
});
Web
for await (const message of session.receive()) {
  switch (message.type) {
    // ...
    case "goingAwayNotice":
      console.log("Server going away. Time left:", message.timeLeft);
      break;
  }
}
Dart
Future<void> _handleLiveServerMessage(LiveServerResponse response) async {
  final message = response.message;
  if (message is GoingAwayNotice) {
    // Prepare for the session to close soon
    developer.log('Server going away. Time left: ${message.timeLeft}');
  }
}
Unity
foreach (var response in session.Responses)
{
    if (response.Payload is LiveSessionGoingAway notice)
    {
        // Prepare for the session to close soon
        TimeSpan timeLeft = notice.TimeLeft;
        Debug.Log($"Server going away notice received. Remaining: {timeLeft}");
    }
}
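Once a going away notification arrives, the client still has to decide when to open the replacement connection. One possible approach, sketched below, is to reconnect slightly before the reported time runs out. `reconnectDelayMs` and its 10-second safety margin are hypothetical choices for this example, and the sketch assumes the remaining time has already been converted to milliseconds (the type of the time-left value differs between platforms).

```javascript
// Hypothetical helper (not an SDK function): how long to wait after a going
// away notification before opening the replacement connection.
// Assumption: timeLeftMs is the notification's remaining time in milliseconds.
function reconnectDelayMs(timeLeftMs, safetyMarginMs = 10_000) {
  if (!Number.isFinite(timeLeftMs) || timeLeftMs < 0) {
    return 0; // Unknown or invalid time left: reconnect immediately.
  }
  // Leave a margin so the new connection is up before the old one closes.
  return Math.max(0, timeLeftMs - safetyMarginMs);
}

console.log(reconnectDelayMs(60_000)); // 50000
console.log(reconnectDelayMs(5_000)); // 0
```

The result can be passed to a timer such as `setTimeout`, with the callback establishing the new connection.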
Resume a session
The Live API supports session resumption to prevent losing conversation context. Every session has a handle, and it can be used in the following ways:
- Maintaining a session before reaching the connection time limit
  The maximum duration of a single, continuous WebSocket connection is about 10 minutes. You can detect when a connection is about to end by listening for a going away notification, and then extend the session by establishing a new connection using the session handle.
- Resuming a session just after a connection drop
  If a connection terminates or drops before the maximum connection time limit (for example, switching from WiFi to 5G), the server keeps the session state for about 10 minutes. During this window, you can resume the session by establishing a new connection using the session handle.
- Resuming a session after an extended time period
  After a connection ends, the server keeps the session state for a few hours. During this window, you can resume the session by establishing a new connection using the session handle. Note that this window is different for the two Gemini API providers: 2 hours for the Gemini Developer API and 24 hours for the Vertex AI Gemini API.
By default, session resumption is disabled. To enable session resumption, pass an empty resumption configuration when establishing a new connection. When enabled, the server periodically sends updates containing a session resumption handle. If the session is disconnected, you can reconnect and pass this handle to resume the session with its context intact.
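The bookkeeping this implies on the client (remember the most recent resumable handle, and know whether it is still worth trying) can be sketched independently of any platform SDK. In the example below, `ResumptionTracker`, its method names, the provider keys, and the `"handle-1"` value are all hypothetical. The retention windows come from the figures quoted above and should be treated as approximate.

```javascript
// Retention windows taken from the text above, in milliseconds.
const RETENTION_MS = new Map([
  ["developer", 2 * 60 * 60 * 1000], // Gemini Developer API: 2 hours
  ["vertex", 24 * 60 * 60 * 1000], // Vertex AI Gemini API: 24 hours
]);

// Hypothetical class (not part of any SDK): tracks the latest session handle
// and whether a resume attempt is still likely to succeed.
class ResumptionTracker {
  constructor(provider) {
    if (!RETENTION_MS.has(provider)) {
      throw new Error(`Unknown provider: ${provider}`);
    }
    this.retentionMs = RETENTION_MS.get(provider);
    this.handle = null;
    this.disconnectedAt = null;
  }

  // Call for every session resumption update received from the server.
  onUpdate({ resumable, newHandle }) {
    if (resumable && newHandle) this.handle = newHandle;
  }

  // Call when the connection ends or drops.
  onDisconnect(nowMs = Date.now()) {
    this.disconnectedAt = nowMs;
  }

  // True if a handle is stored and the retention window has not elapsed.
  canResume(nowMs = Date.now()) {
    if (!this.handle) return false;
    if (this.disconnectedAt === null) return true;
    return nowMs - this.disconnectedAt < this.retentionMs;
  }
}

const tracker = new ResumptionTracker("developer");
tracker.onUpdate({ resumable: true, newHandle: "handle-1" });
tracker.onDisconnect(0);
console.log(tracker.canResume(60_000)); // true
console.log(tracker.canResume(3 * 60 * 60 * 1000)); // false
```

When `canResume` returns false, the app would start a fresh session instead of passing a stale handle.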
The following examples show two options for resuming the session:
Swift
// Local variable to save the active session handle
var activeSessionHandle: String?

// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var session = try await liveModel.connect(
  sessionResumption: SessionResumptionConfig()
)

// Start receiving responses
for try await message in session.responses {
  // Check for new session handles inside your message handling loop
  switch message.payload {
  case let .sessionResumptionUpdate(updateMessage):
    guard let newHandle = updateMessage.newHandle, updateMessage.resumable else {
      continue
    }
    activeSessionHandle = newHandle
    print("SessionResumptionUpdate: handle \(newHandle)")
  // ... handle other LiveServerMessage types ...
  default:
    break
  }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
if let handle = activeSessionHandle {
  session = try await liveModel.connect(
    sessionResumption: SessionResumptionConfig(handle: handle)
  )
}

// Option 2: Resume the session directly on an existing session object
if let handle = activeSessionHandle {
  try await session.resumeSession(
    sessionResumption: SessionResumptionConfig(handle: handle)
  )
}
Kotlin
// Local variable to save the active session handle
var activeSessionHandle: String? = null

// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var session = liveModel.connect(
  sessionResumption = SessionResumptionConfig()
)

// Start receiving responses
session.receive().collect { message ->
  // Process other received response types...

  // Check for new session handles inside your message handling loop
  if (message is LiveSessionResumptionUpdate) {
    if (message.resumable == true && message.newHandle != null) {
      activeSessionHandle = message.newHandle
      Log.d("TAG", "SessionResumptionUpdate: handle ${message.newHandle}")
    }
  }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
activeSessionHandle?.let { handle ->
  session = liveModel.connect(
    sessionResumption = SessionResumptionConfig(handle = handle)
  )
}

// Option 2: Resume the session directly on an existing session object
activeSessionHandle?.let { handle ->
  session.resumeSession(
    sessionResumption = SessionResumptionConfig(handle = handle)
  )
}
Java
For Java, session resumption is not yet supported. Check back soon!
Web
// Local variable to save the active session handle
let activeSessionHandle = null;

// Initialize the session. Passing an empty object requests the server to send SessionResumptionUpdate
let session = await liveModel.connect({});

// Start receiving responses
for await (const message of session.receive()) {
  // Process other received response types...

  // Check for new session handles inside your message handling loop
  if (message.type === 'sessionResumptionUpdate') {
    if (message.resumable && message.newHandle) {
      activeSessionHandle = message.newHandle;
      console.log(`SessionResumptionUpdate: handle ${activeSessionHandle}`);
    }
  }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
if (activeSessionHandle) {
  session = await liveModel.connect({ handle: activeSessionHandle });
}

// Option 2: Resume the session directly on an existing session object
if (activeSessionHandle) {
  await session.resumeSession({ handle: activeSessionHandle });
}
Dart
// Local variable to save the active session handle
String? _activeSessionHandle;

// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var _session = await _liveModel.connect(
  sessionResumption: SessionResumptionConfig(),
);

// Start receiving responses
await for (final message in _session.receive()) {
  // Process other received response types...

  // Check for new session handles inside your message handling loop
  if (message is SessionResumptionUpdate &&
      message.resumable != null &&
      message.resumable!) {
    _activeSessionHandle = message.newHandle;
    log('SessionResumptionUpdate: handle ${message.newHandle}');
  }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
if (_activeSessionHandle != null) {
  _session = await _liveModel.connect(
    sessionResumption: SessionResumptionConfig.resume(_activeSessionHandle!),
  );
}

// Option 2: Alternatively, resume the session directly on an existing session object
if (_activeSessionHandle != null) {
  await _session.resumeSession(
    sessionResumption: SessionResumptionConfig.resume(_activeSessionHandle!),
  );
}
Unity
// Local variable to save the active session handle
string activeSessionHandle = null;

// Initialize the session. Passing an empty config requests the server to send SessionResumptionUpdate
var session = await liveModel.ConnectAsync(
    sessionResumption: new SessionResumptionConfig()
);

// Start receiving responses
await foreach (var response in session.ReceiveAsync())
{
    // Process other received response types...

    // Check for new session handles inside your message handling loop
    if (response.Message is LiveSessionResumptionUpdate updateMessage)
    {
        if (updateMessage.Resumable == true && !string.IsNullOrEmpty(updateMessage.NewHandle))
        {
            activeSessionHandle = updateMessage.NewHandle;
            Debug.Log($"SessionResumptionUpdate: handle {activeSessionHandle}");
        }
    }
}

// The following are alternative options to resume a session. Choose only one.

// Option 1: Create and connect a session to resume with the saved handle
if (!string.IsNullOrEmpty(activeSessionHandle))
{
    session = await liveModel.ConnectAsync(
        sessionResumption: new SessionResumptionConfig(activeSessionHandle)
    );
}

// Option 2: Resume the session directly on an existing session object
if (!string.IsNullOrEmpty(activeSessionHandle))
{
    await session.ResumeSessionAsync(
        sessionResumption: new SessionResumptionConfig(activeSessionHandle)
    );
}

