You can ask a Gemini model to generate images and edit images using both text-only and text-and-image prompts. When you use Firebase AI Logic, you can make this request directly from your app.
With this capability, you can do things like:
- Iteratively generate images through conversation with natural language, adjusting images while maintaining consistency and context.
- Generate images with high-quality text rendering, including long strings of text.
- Generate interleaved text-image output. For example, a blog post with text and images in a single turn. Previously, this required stringing together multiple models.
- Generate images using Gemini's world knowledge and reasoning capabilities.
You can find a complete list of supported modalities and capabilities (along with example prompts) later on this page.
Choosing between Gemini and Imagen models
The Firebase AI Logic SDKs support image generation using either a Gemini model or an Imagen model. For most use cases, start with Gemini, and then choose Imagen for specialized tasks where image quality is critical.
Note that the Firebase AI Logic SDKs do not yet support image input (like for editing) with Imagen models. So, if you want to work with input images, you can use a Gemini model instead.
Choose Gemini when you want:
- To use world knowledge and reasoning to generate contextually relevant images.
- To seamlessly blend text and images or to interleave text and image output.
- To embed accurate visuals within long text sequences.
- To edit images conversationally while maintaining context.
Choose Imagen when you want:
- To prioritize image quality, photorealism, artistic detail, or specific styles (for example, impressionism or anime).
- To explicitly specify the aspect ratio or format of generated images.
Before you begin
Click your Gemini API provider to view provider-specific content and code on this page.
If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen Gemini API provider, and create a GenerativeModel instance.
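Once those steps are complete, the initialization used throughout this page looks roughly like the following minimal Kotlin sketch. It is shown here only for orientation; it assumes the Gemini Developer API backend and simply mirrors the full per-platform examples later on this page.

// Minimal sketch of the setup steps above (Kotlin, Gemini Developer API backend).
// Initialize the backend service and create a `GenerativeModel` instance
// that can return images as well as text.
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image-preview",
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)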
For testing and iterating on your prompts and even getting a generated code snippet, we recommend using Google AI Studio.
Models that support this capability
- gemini-2.5-flash-image-preview (aka "nano banana")
- gemini-2.0-flash-preview-image-generation
Take note that the segment order is different between the 2.0 model name and the 2.5 model name. Also, image output from Gemini is not supported by the standard Flash models like gemini-2.5-flash or gemini-2.0-flash.
Note that the SDKs also support image generation using Imagen models.
Generate and edit images
You can generate and edit images using a Gemini model.
Generate images (text-only input)
Before trying this sample, make sure that you've completed the Before you begin section of this guide. In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.
You can ask a Gemini model to generate images by prompting with text.
Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call generateContent.
Swift
import FirebaseAI

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide a text prompt instructing the model to generate an image
let prompt = "Generate an image of the Eiffel tower with fireworks in the background."

// To generate an image, call `generateContent` with the text input
let response = try await model.generateContent(prompt)

// Handle the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}
Kotlin
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image-preview",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide a text prompt instructing the model to generate an image
val prompt = "Generate an image of the Eiffel tower with fireworks in the background."

// To generate image output, call `generateContent` with the text input
val generatedImageAsBitmap = model.generateContent(prompt)
    // Handle the generated image
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image
Java
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image-preview",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide a text prompt instructing the model to generate an image
Content prompt = new Content.Builder()
    .addText("Generate an image of the Eiffel Tower with fireworks in the background.")
    .build();

// To generate an image, call `generateContent` with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        // iterate over all the parts in the first candidate in the result object
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                // The returned image as a bitmap
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);
Web
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Provide a text prompt instructing the model to generate an image
const prompt = 'Generate an image of the Eiffel Tower with fireworks in the background.';

// To generate an image, call `generateContent` with the text input
const result = await model.generateContent(prompt);

// Handle the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}
Dart
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image-preview',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(
      responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Provide a text prompt instructing the model to generate an image
final prompt = [
  Content.text('Generate an image of the Eiffel Tower with fireworks in the background.')
];

// To generate an image, call `generateContent` with the text input
final response = await model.generateContent(prompt);
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}
Unity
using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new [] { ResponseModality.Text, ResponseModality.Image })
);

// Provide a text prompt instructing the model to generate an image
var prompt = "Generate an image of the Eiffel Tower with fireworks in the background.";

// To generate an image, call `GenerateContentAsync` with the text input
var response = await model.GenerateContentAsync(prompt);

var text = response.Text;
if (!string.IsNullOrWhiteSpace(text)) {
  // Do something with the text
}

// Handle the generated image
var imageParts = response.Candidates.First().Content.Parts
    .OfType<ModelContent.InlineDataPart>()
    .Where(part => part.MimeType == "image/png");
foreach (var imagePart in imageParts) {
  // Load the Image into a Unity Texture2D object
  UnityEngine.Texture2D texture2D = new(2, 2);
  if (texture2D.LoadImage(imagePart.Data.ToArray())) {
    // Do something with the image
  }
}
Generate interleaved images and text
Before trying this sample, make sure that you've completed the Before you begin section of this guide. In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.
You can ask a Gemini model to generate interleaved images with its text responses. For example, you can generate images of what each step of a generated recipe might look like along with the step's instructions, and you don't have to make separate requests to the model or different models.
Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call generateContent.
Swift
import FirebaseAI

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide a text prompt instructing the model to generate interleaved text and images
let prompt = """
Generate an illustrated recipe for a paella.
Create images to go alongside the text as you generate the recipe
"""

// To generate interleaved text and images, call `generateContent` with the text input
let response = try await model.generateContent(prompt)

// Handle the generated text and image
guard let candidate = response.candidates.first else {
  fatalError("No candidates in response.")
}
for part in candidate.content.parts {
  switch part {
  case let textPart as TextPart:
    // Do something with the generated text
    let text = textPart.text
  case let inlineDataPart as InlineDataPart:
    // Do something with the generated image
    guard let uiImage = UIImage(data: inlineDataPart.data) else {
      fatalError("Failed to convert data to UIImage.")
    }
  default:
    fatalError("Unsupported part type: \(part)")
  }
}
Kotlin
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image-preview",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide a text prompt instructing the model to generate interleaved text and images
val prompt = """
    Generate an illustrated recipe for a paella.
    Create images to go alongside the text as you generate the recipe
    """.trimIndent()

// To generate interleaved text and images, call `generateContent` with the text input
val responseContent = model.generateContent(prompt).candidates.first().content

// The response will contain image and text parts interleaved
for (part in responseContent.parts) {
    when (part) {
        is ImagePart -> {
            // ImagePart as a bitmap
            val generatedImageAsBitmap: Bitmap? = part.asImageOrNull()
        }
        is TextPart -> {
            // Text content from the TextPart
            val text = part.text
        }
    }
}
Java
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image-preview",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide a text prompt instructing the model to generate interleaved text and images
Content prompt = new Content.Builder()
    .addText("Generate an illustrated recipe for a paella.\n"
        + "Create images to go alongside the text as you generate the recipe")
    .build();

// To generate interleaved text and images, call `generateContent` with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        Content responseContent = result.getCandidates().get(0).getContent();
        // The response will contain image and text parts interleaved
        for (Part part : responseContent.getParts()) {
            if (part instanceof ImagePart) {
                // ImagePart as a bitmap
                Bitmap generatedImageAsBitmap = ((ImagePart) part).getImage();
            } else if (part instanceof TextPart) {
                // Text content from the TextPart
                String text = ((TextPart) part).getText();
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        System.err.println(t);
    }
}, executor);
Web
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Provide a text prompt instructing the model to generate interleaved text and images
const prompt = 'Generate an illustrated recipe for a paella.\n' +
  'Create images to go alongside the text as you generate the recipe';

// To generate interleaved text and images, call `generateContent` with the text input
const result = await model.generateContent(prompt);

// Handle the generated text and image
try {
  const response = result.response;
  if (response.candidates?.[0].content?.parts) {
    for (const part of response.candidates?.[0].content?.parts) {
      if (part.text) {
        // Do something with the text
        console.log(part.text);
      }
      if (part.inlineData) {
        // Do something with the image
        const image = part.inlineData;
        console.log(image.mimeType, image.data);
      }
    }
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}
Dart
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image-preview',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(
      responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Provide a text prompt instructing the model to generate interleaved text and images
final prompt = [
  Content.text('Generate an illustrated recipe for a paella\n' +
      'Create images to go alongside the text as you generate the recipe')
];

// To generate interleaved text and images, call `generateContent` with the text input
final response = await model.generateContent(prompt);

// Handle the generated text and image
final parts = response.candidates.firstOrNull?.content.parts;
if (parts != null && parts.isNotEmpty) {
  for (final part in parts) {
    if (part is TextPart) {
      // Do something with text part
      final text = part.text;
    }
    if (part is InlineDataPart) {
      // Process image
      final imageBytes = part.bytes;
    }
  }
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}
Unity
using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new [] { ResponseModality.Text, ResponseModality.Image })
);

// Provide a text prompt instructing the model to generate interleaved text and images
var prompt = "Generate an illustrated recipe for a paella \n" +
    "Create images to go alongside the text as you generate the recipe";

// To generate interleaved text and images, call `GenerateContentAsync` with the text input
var response = await model.GenerateContentAsync(prompt);

// Handle the generated text and image
foreach (var part in response.Candidates.First().Content.Parts) {
  if (part is ModelContent.TextPart textPart) {
    if (!string.IsNullOrWhiteSpace(textPart.Text)) {
      // Do something with the text
    }
  } else if (part is ModelContent.InlineDataPart dataPart) {
    if (dataPart.MimeType == "image/png") {
      // Load the Image into a Unity Texture2D object
      UnityEngine.Texture2D texture2D = new(2, 2);
      if (texture2D.LoadImage(dataPart.Data.ToArray())) {
        // Do something with the image
      }
    }
  }
}
Edit images (text-and-image input)
Before trying this sample, make sure that you've completed the Before you begin section of this guide. In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.
You can ask a Gemini model to edit images by prompting with text and one or more images.
Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call generateContent.
Swift
import FirebaseAI

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide an image for the model to edit
guard let image = UIImage(named: "scones") else {
  fatalError("Image file not found.")
}

// Provide a text prompt instructing the model to edit the image
let prompt = "Edit this image to make it look like a cartoon"

// To edit the image, call `generateContent` with the image and text input
let response = try await model.generateContent(image, prompt)

// Handle the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}
Kotlin
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image-preview",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide an image for the model to edit
val bitmap = BitmapFactory.decodeResource(context.resources, R.drawable.scones)

// Provide a text prompt instructing the model to edit the image
val prompt = content {
    image(bitmap)
    text("Edit this image to make it look like a cartoon")
}

// To edit the image, call `generateContent` with the prompt (image and text input)
val generatedImageAsBitmap = model.generateContent(prompt)
    // Handle the generated text and image
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image
Java
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image-preview",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide an image for the model to edit
Bitmap bitmap = BitmapFactory.decodeResource(resources, R.drawable.scones);

// Provide a text prompt instructing the model to edit the image
Content promptcontent = new Content.Builder()
    .addImage(bitmap)
    .addText("Edit this image to make it look like a cartoon")
    .build();

// To edit the image, call `generateContent` with the prompt (image and text input)
ListenableFuture<GenerateContentResponse> response = model.generateContent(promptcontent);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        // iterate over all the parts in the first candidate in the result object
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);
Web
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Prepare an image for the model to edit
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

// Provide a text prompt instructing the model to edit the image
const prompt = "Edit this image to make it look like a cartoon";

const fileInputEl = document.querySelector("input[type=file]");
const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

// To edit the image, call `generateContent` with the image and text input
const result = await model.generateContent([prompt, imagePart]);

// Handle the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}
Dart
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image-preview',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(
      responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Prepare an image for the model to edit
final image = await File('scones.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// Provide a text prompt instructing the model to edit the image
final prompt = TextPart("Edit this image to make it look like a cartoon");

// To edit the image, call `generateContent` with the image and text input
final response = await model.generateContent([
  Content.multi([prompt, imagePart])
]);

// Handle the generated image
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}
Unity
using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new [] { ResponseModality.Text, ResponseModality.Image })
);

// Prepare an image for the model to edit
var imageFile = System.IO.File.ReadAllBytes(System.IO.Path.Combine(
    UnityEngine.Application.streamingAssetsPath, "scones.jpg"));
var image = ModelContent.InlineData("image/jpeg", imageFile);

// Provide a text prompt instructing the model to edit the image
var prompt = ModelContent.Text("Edit this image to make it look like a cartoon.");

// To edit the image, call `GenerateContentAsync` with the image and text input
var response = await model.GenerateContentAsync(new [] { prompt, image });

var text = response.Text;
if (!string.IsNullOrWhiteSpace(text)) {
  // Do something with the text
}

// Handle the generated image
var imageParts = response.Candidates.First().Content.Parts
    .OfType<ModelContent.InlineDataPart>()
    .Where(part => part.MimeType == "image/png");
foreach (var imagePart in imageParts) {
  // Load the Image into a Unity Texture2D object
  UnityEngine.Texture2D texture2D = new(2, 2);
  if (texture2D.LoadImage(imagePart.Data.ToArray())) {
    // Do something with the image
  }
}
Iterate and edit images using multi-turn chat
Before trying this sample, make sure that you've completed the Before you begin section of this guide. In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.
Using multi-turn chat, you can iterate with a Gemini model on the images that it generates or that you supply.
Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call startChat() and sendMessage() to send new user messages.
Swift
import FirebaseAI

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Initialize the chat
let chat = model.startChat()

guard let image = UIImage(named: "scones") else {
  fatalError("Image file not found.")
}

// Provide an initial text prompt instructing the model to edit the image
let prompt = "Edit this image to make it look like a cartoon"

// To generate an initial response, send a user message with the image and text prompt
let response = try await chat.sendMessage(image, prompt)

// Inspect the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

// Follow up requests do not need to specify the image again
let followUpResponse = try await chat.sendMessage("But make it old-school line drawing style")

// Inspect the edited image after the follow up request
guard let followUpInlineDataPart = followUpResponse.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let followUpUIImage = UIImage(data: followUpInlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}
Kotlin
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image-preview",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide an image for the model to edit
val bitmap = BitmapFactory.decodeResource(context.resources, R.drawable.scones)

// Create the initial prompt instructing the model to edit the image
val prompt = content {
    image(bitmap)
    text("Edit this image to make it look like a cartoon")
}

// Initialize the chat
val chat = model.startChat()

// To generate an initial response, send a user message with the image and text prompt
var response = chat.sendMessage(prompt)

// Inspect the returned image
var generatedImageAsBitmap = response
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image

// Follow up requests do not need to specify the image again
response = chat.sendMessage("But make it old-school line drawing style")
generatedImageAsBitmap = response
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image
Java
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image-preview",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide an image for the model to edit
Bitmap bitmap = BitmapFactory.decodeResource(resources, R.drawable.scones);

// Initialize the chat
ChatFutures chat = model.startChat();

// Create the initial prompt instructing the model to edit the image
Content prompt = new Content.Builder()
    .setRole("user")
    .addImage(bitmap)
    .addText("Edit this image to make it look like a cartoon")
    .build();

// To generate an initial response, send a user message with the image and text prompt
ListenableFuture<GenerateContentResponse> response = chat.sendMessage(prompt);

// Extract the image from the initial response
ListenableFuture<@Nullable Bitmap> initialRequest = Futures.transform(response, result -> {
    for (Part part : result.getCandidates().get(0).getContent().getParts()) {
        if (part instanceof ImagePart) {
            ImagePart imagePart = (ImagePart) part;
            return imagePart.getImage();
        }
    }
    return null;
}, executor);

// Follow up requests do not need to specify the image again
ListenableFuture<GenerateContentResponse> modelResponseFuture = Futures.transformAsync(
    initialRequest,
    generatedImage -> {
        Content followUpPrompt = new Content.Builder()
            .addText("But make it old-school line drawing style")
            .build();
        return chat.sendMessage(followUpPrompt);
    },
    executor);

// Add a final callback to check the reworked image
Futures.addCallback(modelResponseFuture, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);
Web
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Prepare an image for the model to edit
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

const fileInputEl = document.querySelector("input[type=file]");
const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

// Provide an initial text prompt instructing the model to edit the image
const prompt = "Edit this image to make it look like a cartoon";

// Initialize the chat
const chat = model.startChat();

// To generate an initial response, send a user message with the image and text prompt
const result = await chat.sendMessage([prompt, imagePart]);

// Request and inspect the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    // Inspect the generated image
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

// Follow up requests do not need to specify the image again
const followUpResult = await chat.sendMessage("But make it old-school line drawing style");

// Request and inspect the returned image
try {
  const followUpInlineDataParts = followUpResult.response.inlineDataParts();
  if (followUpInlineDataParts?.[0]) {
    // Inspect the generated image
    const followUpImage = followUpInlineDataParts[0].inlineData;
    console.log(followUpImage.mimeType, followUpImage.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}
Dart
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image-preview',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(
      responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Prepare an image for the model to edit
final image = await File('scones.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// Provide an initial text prompt instructing the model to edit the image
final prompt = TextPart("Edit this image to make it look like a cartoon");

// Initialize the chat
final chat = model.startChat();

// To generate an initial response, send a user message with the image and text prompt
final response = await chat.sendMessage([
  Content.multi([prompt, imagePart])
]);

// Inspect the returned image
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

// Follow up requests do not need to specify the image again
final followUpResponse = await chat.sendMessage([
  Content.text("But make it old-school line drawing style")
]);

// Inspect the returned image
if (followUpResponse.inlineDataParts.isNotEmpty) {
  final followUpImageBytes = followUpResponse.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}
Unity
using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new [] { ResponseModality.Text, ResponseModality.Image })
);

// Prepare an image for the model to edit
var imageFile = System.IO.File.ReadAllBytes(System.IO.Path.Combine(
    UnityEngine.Application.streamingAssetsPath, "scones.jpg"));
var image = ModelContent.InlineData("image/jpeg", imageFile);

// Provide an initial text prompt instructing the model to edit the image
var prompt = ModelContent.Text("Edit this image to make it look like a cartoon.");

// Initialize the chat
var chat = model.StartChat();

// To generate an initial response, send a user message with the image and text prompt
var response = await chat.SendMessageAsync(new [] { prompt, image });

// Inspect the returned image
var imageParts = response.Candidates.First().Content.Parts
    .OfType<ModelContent.InlineDataPart>()
    .Where(part => part.MimeType == "image/png");

// Load the image into a Unity Texture2D object
UnityEngine.Texture2D texture2D = new(2, 2);
if (texture2D.LoadImage(imageParts.First().Data.ToArray())) {
  // Do something with the image
}

// Follow up requests do not need to specify the image again
var followUpResponse = await chat.SendMessageAsync("But make it old-school line drawing style");

// Inspect the returned image
var followUpImageParts = followUpResponse.Candidates.First().Content.Parts
    .OfType<ModelContent.InlineDataPart>()
    .Where(part => part.MimeType == "image/png");

// Load the image into a Unity Texture2D object
UnityEngine.Texture2D followUpTexture2D = new(2, 2);
if (followUpTexture2D.LoadImage(followUpImageParts.First().Data.ToArray())) {
  // Do something with the image
}
Supported features, limitations, and best practices
Supported modalities and capabilities
The following are supported modalities and capabilities for image output from a Gemini model. Each capability shows an example prompt and has an example code sample above.
- Text → Image(s) (text-only to image)
  - Generate an image of the Eiffel tower with fireworks in the background.
- Text → Image(s) (text rendering within image)
  - Generate a cinematic photo of a large building with this giant text projection mapped on the front of the building.
- Text → Image(s) & Text (interleaved)
  - Generate an illustrated recipe for a paella. Create images alongside the text as you generate the recipe.
  - Generate a story about a dog in a 3D cartoon animation style. For each scene, generate an image.
- Image(s) & Text → Image(s) & Text (interleaved)
  - [image of a furnished room] + What other color sofas would work in my space? Can you update the image?
- Image editing (text-and-image to image)
  - [image of scones] + Edit this image to make it look like a cartoon
  - [image of a cat] + [image of a pillow] + Create a cross stitch of my cat on this pillow.
- Multi-turn image editing (chat)
  - [image of a blue car] + Turn this car into a convertible., then Now change the color to yellow.
Limitations and best practices
The following are limitations and best practices for image output from a Gemini model.
- Image-generating Gemini models support the following:
  - Generating PNG images with a maximum dimension of 1024 px.
  - Generating and editing images of people.
  - Using safety filters that provide a flexible and less restrictive user experience.
- Image-generating Gemini models do not support the following:
  - Including audio or video inputs.
  - Generating only images. The models will always return both text and images, and you must include responseModalities: ["TEXT", "IMAGE"] in your model configuration.
- For best performance, use the following languages: en, es-mx, ja-jp, zh-cn, hi-in.
- Image generation may not always trigger. Here are some known issues:
  - The model may output text only. Try asking for image outputs explicitly (for example, "generate an image", "provide images as you go along", "update the image").
  - The model may stop generating partway through. Try again or try a different prompt.
  - The model may generate text as an image. Try asking for text outputs explicitly. For example, "generate narrative text along with illustrations."
- When generating text for an image, Gemini works best if you first generate the text and then ask for an image with the text, as illustrated in the sketch after this list.
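The following minimal Kotlin sketch illustrates that last best practice by reusing the multi-turn chat API shown earlier on this page. The slogan/poster scenario and the exact prompt wording are illustrative assumptions, not part of the official samples.

// A sketch of the "text first, then image" approach, using the same chat API
// as the multi-turn example above. The beach-cleanup poster use case is hypothetical.
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image-preview",
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)
val chat = model.startChat()

// Turn 1: generate the text first so the exact wording is settled
val textResponse = chat.sendMessage("Write a short, catchy slogan for a beach cleanup event.")
val slogan = textResponse.text

// Turn 2: ask for an image that renders that text
val imageResponse = chat.sendMessage("Now generate a poster image that displays this slogan: $slogan")
val poster = imageResponse
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image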