This guide shows you how to send various types of requests to the Gemini API through Apple's Foundation Models framework using the Firebase AI Logic SDK for Apple platforms.
This page shows examples of how to send the following types of requests:
- Generate text from text-only input
- Generate text during a multi-turn session (chat)
- Generate text from multimodal input (like images)
- Generate images from text-only input
- Generate structured JSON output
Generate text
Gemini models support the following capabilities for generating text:
- Generate text from text-only input
- Generate text during a multi-turn session (chat)
- Generate text from multimodal input (like images)
Models that support this capability
- gemini-3.1-pro-preview
- gemini-3.5-flash
- gemini-3.1-flash-lite
Generate text from text-only input
You can ask a Gemini model to generate text by prompting with text-only input.
import FoundationModels
import FirebaseCore
import FirebaseAILogic

// Initialize the Gemini Developer API backend service.
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Initialize a `geminiLanguageModel` with a Gemini model that supports your use case.
let model = ai.geminiLanguageModel(name: "gemini-3.5-flash")

// Provide a prompt that contains text.
let prompt = "Write a story about a magic backpack."

// Create a session by injecting the model into Apple's `LanguageModelSession`.
// For a single-turn interaction, create a new session each time you call the model.
let session = LanguageModelSession(model: model)

// Generate a text response to the prompt.
let response = try await session.respond(to: prompt)
print(response.content)
Stream the response
You can achieve faster interactions by handling partial results as they stream in instead of waiting for the model to generate the entire result. To stream the response, use streamResponse(to:) instead of respond(to:).
// imports
// initialization of Gemini API backend service and a `geminiLanguageModel`

// Provide a prompt that contains text.
let prompt = "Write a story about a magic backpack."

// Create a session by injecting the model into Apple's `LanguageModelSession`.
// For a single-turn interaction, create a new session each time you call the model.
let session = LanguageModelSession(model: model)

// Generate a text response to the prompt.
// To stream the response, use `streamResponse(to:)` instead of `respond(to:)`
let stream = session.streamResponse(to: prompt)

var response = ""
for try await snapshot in stream {
  // The snapshot contains *all* content generated so far.
  response = snapshot.content
}
print(response)
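Because each snapshot carries all of the text generated so far, a UI that appends text incrementally needs to work out what is new in each snapshot. The following is a minimal sketch of that idea in plain Swift, without the SDK. `makeSnapshotStream()`, `newText(in:after:)`, and `collectDeltas()` are hypothetical helpers, and the stream of plain strings is only a stand-in for the `snapshot.content` values from a real streamed response.

```swift
// Hypothetical stand-in for a streamed response: each element is a cumulative
// snapshot of the text generated so far, not a delta.
func makeSnapshotStream() -> AsyncThrowingStream<String, Error> {
  let snapshots = ["Once", "Once upon", "Once upon a time"]
  return AsyncThrowingStream { continuation in
    for snapshot in snapshots {
      continuation.yield(snapshot)
    }
    continuation.finish()
  }
}

// Returns only the text that was appended since the previous snapshot.
// If the new snapshot does not extend the previous one, the whole snapshot
// is returned so that the caller can replace its content.
func newText(in snapshot: String, after previous: String) -> String {
  guard snapshot.hasPrefix(previous) else { return snapshot }
  return String(snapshot.dropFirst(previous.count))
}

// Walks the stream and records the newly added text for each snapshot.
func collectDeltas() async throws -> [String] {
  var previous = ""
  var deltas: [String] = []
  for try await snapshot in makeSnapshotStream() {
    deltas.append(newText(in: snapshot, after: previous))
    previous = snapshot
  }
  return deltas
}
```

If you only need the final text, assigning the latest snapshot to a variable on each iteration is enough and no delta computation is needed.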
Generate text during a multi-turn session (chat)
import FoundationModels
import FirebaseCore
import FirebaseAILogic

// Initialize the Gemini Developer API backend service.
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Initialize a `geminiLanguageModel` with a Gemini model that supports your use case.
let model = ai.geminiLanguageModel(name: "gemini-3.5-flash")

// Create a session by injecting the model into Apple's `LanguageModelSession`.
// The session maintains state between each request.
let session = LanguageModelSession(model: model)

// Generate a text response to an initial prompt.
let response = try await session.respond(to: "Hello! I'd like to learn more about Albert Einstein.")
print(response.content)
// Example response from model: "What would you like to know?"

// Continue using the existing session. Each prompt and response is added to the transcript.
let response2 = try await session.respond(to: "When was he born?")
print(response2.content)
// Example response from model: "March 14, 1879"
Generate text from multimodal input (like images)
You can ask a Gemini model to generate text by prompting with text and a file, like an image or PDF.
import FoundationModels
import FirebaseCore
import FirebaseAILogic

// Initialize the Gemini Developer API backend service.
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Initialize a `geminiLanguageModel` with a Gemini model that supports your use case.
let model = ai.geminiLanguageModel(name: "gemini-3.5-flash")

// Create a session by injecting the model into Apple's `LanguageModelSession`.
// For a single-turn interaction, create a new session each time you call the model.
let session = LanguageModelSession(model: model)

let cgImage: CGImage = // ... fetch CGImage from your datasource.

// Provide a prompt that combines text with the image attachment.
let response = try await session.respond {
  "What are the dominant colors of this image, in order?"
  Attachment(cgImage)
}
print(response.content)
Stream the response
You can achieve faster interactions by handling partial results as they stream in instead of waiting for the model to generate the entire result. To stream the response, use streamResponse instead of respond.
// imports
// initialization of Gemini API backend service and a `geminiLanguageModel`

// Create a session by injecting the model into Apple's `LanguageModelSession`.
// For a single-turn interaction, create a new session each time you call the model.
let session = LanguageModelSession(model: model)

let cgImage: CGImage = // ... fetch CGImage from your datasource.

// To stream the response, use `streamResponse` instead of `respond`.
let stream = session.streamResponse {
  "What are the dominant colors of this image, in order?"
  Attachment(cgImage)
}

var response = ""
for try await snapshot in stream {
  // The snapshot contains *all* content generated so far.
  response = snapshot.content
}
print(response)
Generate images (using "Nano Banana" models)
Models that support this capability
- gemini-3-pro-image-preview (aka "Nano Banana Pro")
- gemini-3.1-flash-image-preview (aka "Nano Banana 2")
- gemini-2.5-flash-image (aka "Nano Banana")
You can ask a Gemini image-generating model (like a "Nano Banana" model) to generate an image by prompting with text-only input.
The following example shows how to generate only an image, but Gemini image-generating models can generate both images and text.
import FoundationModels
import FirebaseCore
import FirebaseAILogic

// Initialize the Gemini Developer API backend service.
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Initialize a `geminiLanguageModel` with a Gemini image-generating model that supports your use case.
let model = ai.geminiLanguageModel(
  name: "gemini-2.5-flash-image",
  options: GeminiGenerationOptions(responseModalities: .image)
)

// Create a session by injecting the model into Apple's `LanguageModelSession`.
let session = LanguageModelSession(model: model)

// Generate a response to a text-only prompt.
let response = try await session.respond(
  to: "Generate an image of the Eiffel tower with fireworks in the background."
)

var generatedImage: CIImage?

// Find the image in the transcriptEntries.
for entry in response.transcriptEntries {
  if case let .response(response) = entry {
    for segment in response.segments {
      if case let .attachment(attachment) = segment,
         case let .image(image) = attachment.content {
        generatedImage = image.ciImage
      }
    }
  }
}
Generate structured JSON output
Models that support this capability
- gemini-3.1-pro-preview
- gemini-3.5-flash
- gemini-3.1-flash-lite
Gemini models return responses as unstructured text by default. However, some use cases require structured text, like JSON. For example, you might be using the response for other downstream tasks that require an established data schema.
You can configure the model to format its response according to a JSON schema that you supply. For details, best practices, and use cases for generating structured JSON output, see the general Generate structured output guide.
import FoundationModels
import FirebaseCore
import FirebaseAILogic

// Describe the structure that the model should generate.
@Generable(description: "Basic profile information about a cat")
struct CatProfile {
  var name: String

  @Guide(description: "The age of the cat", .range(0...20))
  var age: Int

  @Guide(description: "A one sentence profile about the cat's personality")
  var profile: String
}

// Initialize the Gemini Developer API backend service.
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Initialize a `geminiLanguageModel` with a Gemini model that supports your use case.
let model = ai.geminiLanguageModel(name: "gemini-3.5-flash")

// Create a session by injecting the model into Apple's `LanguageModelSession`.
let session = LanguageModelSession(model: model)

// Generate a response whose content is a `CatProfile` value.
let response = try await session.respond(
  to: "Generate a cute rescue cat profile with an Elvish theme",
  generating: CatProfile.self
)
let cat = response.content
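To illustrate why an established schema matters for downstream tasks, the following is a minimal sketch that uses only Foundation's `Codable`, not the Firebase AI Logic SDK. `CatProfileRecord`, `decodeCatProfile(from:)`, and the sample JSON are hypothetical and exist only for this illustration. The sample JSON is hand-written, not actual model output. The sketch assumes a downstream consumer receives the profile as JSON text and wants to reject anything that does not match the expected shape or the 0 to 20 age range.

```swift
import Foundation

// Hypothetical mirror of the `CatProfile` shape, using plain `Codable`.
struct CatProfileRecord: Codable, Equatable {
  var name: String
  var age: Int
  var profile: String
}

// Hand-written sample payload for illustration; not actual model output.
let json = """
{"name": "Luthien", "age": 3, "profile": "A gentle cat who loves moonlit windowsills."}
"""

// Decodes the JSON text and enforces the same 0...20 age range that the
// schema describes. Throws if a field is missing, has the wrong type, or
// the age is out of range.
func decodeCatProfile(from text: String) throws -> CatProfileRecord {
  let record = try JSONDecoder().decode(CatProfileRecord.self, from: Data(text.utf8))
  guard (0...20).contains(record.age) else {
    throw DecodingError.dataCorrupted(
      .init(codingPath: [], debugDescription: "age is outside 0...20"))
  }
  return record
}
```

Calling `try decodeCatProfile(from: json)` returns a typed value whose fields can be used directly, whereas unstructured text would first need ad hoc parsing.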

